Overview

Dataset statistics

Number of variables: 43
Number of observations: 224463
Missing cells: 467620
Missing cells (%): 4.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 448.5 MiB
Average record size in memory: 2.0 KiB
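
The two derived figures in this table can be checked from the primary ones. The sketch below assumes that "Missing cells (%)" is the share of missing cells among all cells (variables × observations) and that the average record size is the total memory divided by the number of observations; both assumptions reproduce the reported values.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 43
n_observations = 224463
missing_cells = 467620
total_size_mib = 448.5

# Assumed definitions: share of missing cells, and memory per row in KiB.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_kib:.1f} KiB")
```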

Variable types

Numeric: 11
Categorical: 32

Warnings

Age is highly correlated with VeteransBenefits [High correlation]
IndustryCode is highly correlated with OccupationCode and 2 other fields [High correlation]
OccupationCode is highly correlated with IndustryCode and 2 other fields [High correlation]
NumOfPersonsWorkForEmployer is highly correlated with IndustryCode and 2 other fields [High correlation]
VeteransBenefits is highly correlated with Age and 1 other field [High correlation]
WeeksWorkedInYear is highly correlated with IndustryCode and 3 other fields [High correlation]
Age is highly correlated with VeteransBenefits [High correlation]
IndustryCode is highly correlated with OccupationCode and 3 other fields [High correlation]
OccupationCode is highly correlated with IndustryCode and 3 other fields [High correlation]
NumOfPersonsWorkForEmployer is highly correlated with IndustryCode and 3 other fields [High correlation]
VeteransBenefits is highly correlated with Age and 4 other fields [High correlation]
WeeksWorkedInYear is highly correlated with IndustryCode and 3 other fields [High correlation]
Age is highly correlated with VeteransBenefits [High correlation]
IndustryCode is highly correlated with OccupationCode and 2 other fields [High correlation]
OccupationCode is highly correlated with IndustryCode and 2 other fields [High correlation]
NumOfPersonsWorkForEmployer is highly correlated with IndustryCode and 2 other fields [High correlation]
VeteransBenefits is highly correlated with Age and 1 other field [High correlation]
WeeksWorkedInYear is highly correlated with IndustryCode and 3 other fields [High correlation]
IndustryCode is highly correlated with NumOfPersonsWorkForEmployer and 9 other fields [High correlation]
Year is highly correlated with LiveInThisHouse1YearAgo and 1 other field [High correlation]
HispanicOrigin is highly correlated with CntryOfBirthFather and 3 other fields [High correlation]
WagePerHour is highly correlated with MemberOfALaborUnion [High correlation]
NumOfPersonsWorkForEmployer is highly correlated with IndustryCode and 10 other fields [High correlation]
MajorOccupationCode is highly correlated with IndustryCode and 11 other fields [High correlation]
Education is highly correlated with NumOfPersonsWorkForEmployer and 12 other fields [High correlation]
FamilyMembersUnder18 is highly correlated with IndustryCode and 11 other fields [High correlation]
MigPrevResInSunbelt is highly correlated with MigCodeMoveWithinReg and 5 other fields [High correlation]
CntryOfBirthFather is highly correlated with HispanicOrigin and 4 other fields [High correlation]
DetailedHholdAndFamStat is highly correlated with NumOfPersonsWorkForEmployer and 12 other fields [High correlation]
MigCodeMoveWithinReg is highly correlated with MigPrevResInSunbelt and 5 other fields [High correlation]
Sex is highly correlated with DetailedHholdAndFamStat [High correlation]
FillIncVeteransAdmin is highly correlated with VeteransBenefits [High correlation]
EnrollInEdUInstlastWk is highly correlated with Education and 2 other fields [High correlation]
ReasonForUnemployment is highly correlated with ClassOfWorker [High correlation]
CntryOfBirthMother is highly correlated with HispanicOrigin and 4 other fields [High correlation]
MemberOfALaborUnion is highly correlated with WagePerHour and 1 other field [High correlation]
MajorIndustryCode is highly correlated with IndustryCode and 12 other fields [High correlation]
MigCodeChangeInReg is highly correlated with MigPrevResInSunbelt and 5 other fields [High correlation]
LiveInThisHouse1YearAgo is highly correlated with Year and 7 other fields [High correlation]
CntryOfBirthSelf is highly correlated with HispanicOrigin and 4 other fields [High correlation]
Citizenship is highly correlated with HispanicOrigin and 3 other fields [High correlation]
Race is highly correlated with CntryOfBirthFather and 2 other fields [High correlation]
StateOfPreviousResidence is highly correlated with MigPrevResInSunbelt and 5 other fields [High correlation]
MigCodeChangeInMsa is highly correlated with MigPrevResInSunbelt and 5 other fields [High correlation]
TaxFilerStat is highly correlated with IndustryCode and 12 other fields [High correlation]
FullOrPartTimeEmploymentStat is highly correlated with Year and 2 other fields [High correlation]
OccupationCode is highly correlated with IndustryCode and 8 other fields [High correlation]
RegionOfPreviousResidence is highly correlated with MigPrevResInSunbelt and 5 other fields [High correlation]
WeeksWorkedInYear is highly correlated with IndustryCode and 10 other fields [High correlation]
Age is highly correlated with IndustryCode and 14 other fields [High correlation]
DetailedHholdSumInHhold is highly correlated with Education and 7 other fields [High correlation]
VeteransBenefits is highly correlated with IndustryCode and 13 other fields [High correlation]
ClassOfWorker is highly correlated with IndustryCode and 12 other fields [High correlation]
MaritalStatus is highly correlated with Education and 6 other fields [High correlation]
Year is highly correlated with MigPrevResInSunbelt and 5 other fields [High correlation]
HispanicOrigin is highly correlated with CntryOfBirthFather and 1 other field [High correlation]
MajorOccupationCode is highly correlated with MajorIndustryCode [High correlation]
Education is highly correlated with VeteransBenefits [High correlation]
FamilyMembersUnder18 is highly correlated with DetailedHholdAndFamStat and 2 other fields [High correlation]
MigPrevResInSunbelt is highly correlated with Year and 7 other fields [High correlation]
CntryOfBirthFather is highly correlated with HispanicOrigin and 3 other fields [High correlation]
DetailedHholdAndFamStat is highly correlated with FamilyMembersUnder18 and 3 other fields [High correlation]
MigCodeMoveWithinReg is highly correlated with Year and 7 other fields [High correlation]
FillIncVeteransAdmin is highly correlated with VeteransBenefits [High correlation]
CntryOfBirthMother is highly correlated with HispanicOrigin and 3 other fields [High correlation]
MajorIndustryCode is highly correlated with MajorOccupationCode [High correlation]
MigCodeChangeInReg is highly correlated with Year and 7 other fields [High correlation]
LiveInThisHouse1YearAgo is highly correlated with Year and 7 other fields [High correlation]
CntryOfBirthSelf is highly correlated with CntryOfBirthFather and 2 other fields [High correlation]
Citizenship is highly correlated with CntryOfBirthFather and 2 other fields [High correlation]
StateOfPreviousResidence is highly correlated with MigPrevResInSunbelt and 5 other fields [High correlation]
MigCodeChangeInMsa is highly correlated with Year and 7 other fields [High correlation]
TaxFilerStat is highly correlated with DetailedHholdAndFamStat and 1 other field [High correlation]
FullOrPartTimeEmploymentStat is highly correlated with Year and 5 other fields [High correlation]
RegionOfPreviousResidence is highly correlated with MigPrevResInSunbelt and 5 other fields [High correlation]
DetailedHholdSumInHhold is highly correlated with FamilyMembersUnder18 and 2 other fields [High correlation]
VeteransBenefits is highly correlated with Education and 5 other fields [High correlation]
MigCodeChangeInMsa has 112154 (50.0%) missing values [Missing]
MigCodeChangeInReg has 112154 (50.0%) missing values [Missing]
MigCodeMoveWithinReg has 112154 (50.0%) missing values [Missing]
MigPrevResInSunbelt has 112154 (50.0%) missing values [Missing]
CntryOfBirthFather has 7498 (3.3%) missing values [Missing]
CntryOfBirthMother has 6843 (3.0%) missing values [Missing]
CntryOfBirthSelf has 3869 (1.7%) missing values [Missing]
DividendsFromStocks is highly skewed (γ1 = 27.45959869) [Skewed]
ID is uniformly distributed [Uniform]
ID has unique values [Unique]
Age has 3205 (1.4%) zeros [Zeros]
IndustryCode has 113109 (50.4%) zeros [Zeros]
OccupationCode has 113109 (50.4%) zeros [Zeros]
WagePerHour has 211831 (94.4%) zeros [Zeros]
CapitalGains has 216173 (96.3%) zeros [Zeros]
CapitalLosses has 220033 (98.0%) zeros [Zeros]
DividendsFromStocks has 200794 (89.5%) zeros [Zeros]
NumOfPersonsWorkForEmployer has 107852 (48.0%) zeros [Zeros]
WeeksWorkedInYear has 107852 (48.0%) zeros [Zeros]
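
The skewness warning and the zeros warnings are related: a column that is mostly zeros with a few large values has a long right tail. The sketch below computes the plain moment-based skewness γ1 = m3 / m2^1.5 on a hypothetical toy column. The data are invented for illustration, and the report may use a bias-corrected estimator, so the formula is an assumption rather than the exact one behind the reported 27.46.

```python
def sample_skewness(values):
    """Moment-based skewness: third central moment over variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical zero-inflated column: 90% zeros plus a few large values.
toy_column = [0] * 90 + [10] * 8 + [500, 2000]
# A symmetric column for comparison.
symmetric_column = [1, 2, 3, 4, 5]

print(sample_skewness(toy_column))        # strongly positive
print(sample_skewness(symmetric_column))  # zero for symmetric data
```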

Reproduction

Analysis started: 2021-12-29 21:42:43.143496
Analysis finished: 2021-12-29 21:43:47.259025
Duration: 1 minute and 4.12 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
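
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2021-12-29 21:42:43.143496")
finished = datetime.fromisoformat("2021-12-29 21:43:47.259025")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```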

Variables

ID
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 224463
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 149769.4031
Minimum: 1
Maximum: 299285
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 14963.1
Q1: 74988.5
Median: 149703
Q3: 224589.5
95-th percentile: 284239.9
Maximum: 299285
Range: 299284
Interquartile range (IQR): 149601

Descriptive statistics

Standard deviation: 86375.29619
Coefficient of variation (CV): 0.57672191
Kurtosis: -1.200730716
Mean: 149769.4031
Median Absolute Deviation (MAD): 74804
Skewness: -0.002084998314
Sum: 3.361768952 × 10^10
Variance: 7460691792
Monotonicity: Not monotonic
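
Several of the ID statistics above are derived from others. The sketch below recomputes them from the reported primary figures, assuming the usual definitions (range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count). The results agree with the report up to rounding of the inputs.

```python
# Primary figures copied from the ID tables.
minimum, maximum = 1, 299285
q1, q3 = 74988.5, 224589.5
mean, std = 149769.4031, 86375.29619
n = 224463

value_range = maximum - minimum   # reported: 299284
iqr = q3 - q1                     # reported: 149601
cv = std / mean                   # reported: 0.57672191
variance = std ** 2               # reported: 7460691792
total = mean * n                  # reported: 3.361768952 × 10^10

print(value_range, iqr, round(cv, 8), round(variance), f"{total:.9e}")
```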
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
40327              1       < 0.1%
253586             1       < 0.1%
86421              1       < 0.1%
225832             1       < 0.1%
111885             1       < 0.1%
68961              1       < 0.1%
27141              1       < 0.1%
279292             1       < 0.1%
258249             1       < 0.1%
227502             1       < 0.1%
Other values (224453)  224453  > 99.9%

Value   Count   Frequency (%)
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%
10      1       < 0.1%
12      1       < 0.1%

Value   Count   Frequency (%)
299285  1       < 0.1%
299283  1       < 0.1%
299282  1       < 0.1%
299281  1       < 0.1%
299280  1       < 0.1%
299279  1       < 0.1%
299275  1       < 0.1%
299273  1       < 0.1%
299272  1       < 0.1%
299271  1       < 0.1%

Age
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 91
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.52253601
Minimum: 0
Maximum: 90
Zeros: 3205
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 15
Median: 33
Q3: 50
95-th percentile: 75
Maximum: 90
Range: 90
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 22.31026591
Coefficient of variation (CV): 0.646252231
Kurtosis: -0.7343207482
Mean: 34.52253601
Median Absolute Deviation (MAD): 17
Skewness: 0.371608015
Sum: 7749032
Variance: 497.7479651
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
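
"Fixed size bins" means the observed range is cut into equally wide intervals. The function below is an illustrative sketch of that binning, not the plotting code used by the report; `fixed_width_histogram` is a hypothetical helper name. With Age spanning 0 to 90 and 50 bins, each bin is 1.8 years wide, so every bin covers one or two integer ages.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all values identical, put everything in one bin.
        return 0.0, [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value falls on the right edge; clip it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return width, counts

# Toy input: one observation per integer age from 0 to 90.
width, counts = fixed_width_histogram(list(range(91)), bins=50)
print(width, len(counts), sum(counts))
```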
Value              Count   Frequency (%)
34                 3904    1.7%
33                 3841    1.7%
35                 3806    1.7%
4                  3790    1.7%
5                  3784    1.7%
3                  3712    1.7%
31                 3711    1.7%
38                 3702    1.6%
36                 3672    1.6%
37                 3626    1.6%
Other values (81)  186915  83.3%

Value   Count   Frequency (%)
0       3205    1.4%
1       3474    1.5%
2       3591    1.6%
3       3712    1.7%
4       3790    1.7%
5       3784    1.7%
6       3538    1.6%
7       3585    1.6%
8       3561    1.6%
9       3487    1.6%

Value   Count   Frequency (%)
90      816     0.4%
89      236     0.1%
88      286     0.1%
87      357     0.2%
86      390     0.2%
85      454     0.2%
84      585     0.3%
83      614     0.3%
82      710     0.3%
81      795     0.4%

ClassOfWorker
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.2 MiB
Not in universe                 112609
Private                         81152
Self-employed-not incorporated  9593
Local government                8753
State government                4757
Other values (4)                7599

Length

Max length: 31
Median length: 16
Mean length: 14.02185215
Min length: 8

Characters and Unicode

Total characters: 3147387
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
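
The length and character statistics for ClassOfWorker can be reproduced from the value counts with the standard `unicodedata` module. One assumption is needed and is labeled in the code: each stored value appears to start with a single leading space. That is inferred from the statistics themselves ("Private" has 7 characters, yet the minimum length is 8) and is not stated in the report. Under that assumption the totals below match the reported ones.

```python
import unicodedata
from collections import Counter

# Value counts copied from the ClassOfWorker "Common Values" table.
# ASSUMPTION: every raw value carries one leading space (inferred, not stated).
value_counts = {
    " Not in universe": 112609,
    " Private": 81152,
    " Self-employed-not incorporated": 9593,
    " Local government": 8753,
    " State government": 4757,
    " Self-employed-incorporated": 3654,
    " Federal government": 3268,
    " Never worked": 500,
    " Without pay": 177,
}

n_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

# Tally Unicode general categories (Ll, Lu, Zs, Pd, ...) weighted by row count.
category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        category_counts[unicodedata.category(ch)] += count

print(n_rows, total_chars, round(mean_length, 8))
print(dict(category_counts))
```

Here `Ll`, `Lu`, `Zs` and `Pd` are the general-category codes for Lowercase Letter, Uppercase Letter, Space Separator and Dash Punctuation.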

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row Private
3rd row Self-employed-incorporated
4th row Not in universe
5th row Private

Common Values

Value                           Count   Frequency (%)
Not in universe                 112609  50.2%
Private                         81152   36.2%
Self-employed-not incorporated  9593    4.3%
Local government                8753    3.9%
State government                4757    2.1%
Self-employed-incorporated      3654    1.6%
Federal government              3268    1.5%
Never worked                    500     0.2%
Without pay                     177     0.1%

Length

Histogram of lengths of the category

Pie chart

Value                       Count   Frequency (%)
not                         112609  23.6%
in                          112609  23.6%
universe                    112609  23.6%
private                     81152   17.0%
government                  16778   3.5%
self-employed-not           9593    2.0%
incorporated                9593    2.0%
local                       8753    1.8%
state                       4757    1.0%
self-employed-incorporated  3654    0.8%
Other values (5)            4622    1.0%

Most occurring characters

Value              Count   Frequency (%)
(space)            476729  15.1%
e                  405707  12.9%
i                  319794  10.2%
n                  281614  8.9%
t                  243247  7.7%
r                  241301  7.7%
v                  211039  6.7%
o                  188151  6.0%
N                  113109  3.6%
u                  112786  3.6%
Other values (19)  553910  17.6%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  2419701  76.9%
Space Separator   476729   15.1%
Uppercase Letter  224463   7.1%
Dash Punctuation  26494    0.8%

Most frequent character per category

Lowercase Letter
Value              Count   Frequency (%)
e                  405707  16.8%
i                  319794  13.2%
n                  281614  11.6%
t                  243247  10.1%
r                  241301  10.0%
v                  211039  8.7%
o                  188151  7.8%
u                  112786  4.7%
s                  112609  4.7%
a                  111354  4.6%
Other values (11)  192099  7.9%
Uppercase Letter
Value   Count   Frequency (%)
N       113109  50.4%
P       81152   36.2%
S       18004   8.0%
L       8753    3.9%
F       3268    1.5%
W       177     0.1%
Space Separator
Value    Count   Frequency (%)
(space)  476729  100.0%
Dash Punctuation
Value   Count   Frequency (%)
-       26494   100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   2644164  84.0%
Common  503223   16.0%

Most frequent character per script

Latin
Value              Count   Frequency (%)
e                  405707  15.3%
i                  319794  12.1%
n                  281614  10.7%
t                  243247  9.2%
r                  241301  9.1%
v                  211039  8.0%
o                  188151  7.1%
N                  113109  4.3%
u                  112786  4.3%
s                  112609  4.3%
Other values (17)  414807  15.7%
Common
Value    Count   Frequency (%)
(space)  476729  94.7%
-        26494   5.3%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   3147387  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
(space)            476729  15.1%
e                  405707  12.9%
i                  319794  10.2%
n                  281614  8.9%
t                  243247  7.7%
r                  241301  7.7%
v                  211039  6.7%
o                  188151  6.0%
N                  113109  3.6%
u                  112786  3.6%
Other values (19)  553910  17.6%

IndustryCode
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.33978874
Minimum: 0
Maximum: 51
Zeros: 113109
Zeros (%): 50.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 33
95-th percentile: 44
Maximum: 51
Range: 51
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 18.04723881
Coefficient of variation (CV): 1.176498524
Kurtosis: -1.501504475
Mean: 15.33978874
Median Absolute Deviation (MAD): 0
Skewness: 0.5163413471
Sum: 3443215
Variance: 325.7028286
Monotonicity: Not monotonic
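
A median of 0 and a MAD of 0 are a direct consequence of the 50.4% zeros: when more than half of the values are identical, the median equals that value, and more than half of the absolute deviations from it are zero as well. The sketch below shows this on hypothetical toy data; `median_absolute_deviation` is an illustrative helper, not a function taken from the report's tooling.

```python
from statistics import median

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    med = median(values)
    return median(abs(v - med) for v in values)

# Hypothetical column in which 6 of 10 values (60%) are zero.
mostly_zero = [0] * 6 + [33, 37, 43, 45]
# A column without a dominant value, for comparison.
spread_out = [1, 2, 3, 4, 100]

print(median(mostly_zero), median_absolute_deviation(mostly_zero))
print(median(spread_out), median_absolute_deviation(spread_out))
```

For such columns the MAD says nothing about the spread of the non-zero values, so the standard deviation or the upper quantiles are more informative.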
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
0                  113109  50.4%
33                 19356   8.6%
43                 9344    4.2%
4                  6855    3.1%
42                 5218    2.3%
45                 4921    2.2%
29                 4790    2.1%
37                 4690    2.1%
41                 4319    1.9%
32                 4056    1.8%
Other values (42)  47805   21.3%

Value   Count   Frequency (%)
0       113109  50.4%
1       915     0.4%
2       2507    1.1%
3       682     0.3%
4       6855    3.1%
5       628     0.3%
6       597     0.3%
7       506     0.2%
8       604     0.3%
9       1122    0.5%

Value   Count   Frequency (%)
51      39      < 0.1%
50      1928    0.9%
49      657     0.3%
48      694     0.3%
47      1851    0.8%
46      204     0.1%
45      4921    2.2%
44      2827    1.3%
43      9344    4.2%
42      5218    2.3%

OccupationCode
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 47
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.33823392
Minimum: 0
Maximum: 46
Zeros: 113109
Zeros (%): 50.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 26
95-th percentile: 38
Maximum: 46
Range: 46
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 14.46891626
Coefficient of variation (CV): 1.276117283
Kurtosis: -0.907283084
Mean: 11.33823392
Median Absolute Deviation (MAD): 0
Skewness: 0.8240818021
Sum: 2545014
Variance: 209.3495378
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value              Count   Frequency (%)
0                  113109  50.4%
2                  9814    4.4%
26                 8812    3.9%
19                 6193    2.8%
29                 5818    2.6%
36                 4673    2.1%
34                 4553    2.0%
10                 4058    1.8%
16                 3859    1.7%
33                 3759    1.7%
Other values (37)  59815   26.6%

Value   Count   Frequency (%)
0       113109  50.4%
1       628     0.3%
2       9814    4.4%
3       3639    1.6%
4       1563    0.7%
5       958     0.4%
6       477     0.2%
7       831     0.4%
8       2387    1.1%
9       824     0.4%

Value   Count   Frequency (%)
46      39      < 0.1%
45      168     0.1%
44      1798    0.8%
43      1571    0.7%
42      2121    0.9%
41      1807    0.8%
40      720     0.3%
39      1145    0.5%
38      3420    1.5%
37      2519    1.1%

Education
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 MiB
High school graduate        54559
Children                    53305
Some college but no degree  31216
Bachelors degree(BA AB BS)  22214
7th and 8th grade           9027
Other values (12)           54142

Length

Max length: 39
Median length: 21
Mean length: 19.85110241
Min length: 9

Characters and Unicode

Total characters: 4455838
Distinct characters: 47
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 10th grade
2nd row 11th grade
3rd row High school graduate
4th row High school graduate
5th row Masters degree(MA MS MEng MEd MSW MBA)

Common Values

Value                                   Count  Frequency (%)
High school graduate                    54559  24.3%
Children                                53305  23.7%
Some college but no degree              31216  13.9%
Bachelors degree(BA AB BS)              22214  9.9%
7th and 8th grade                       9027   4.0%
10th grade                              8460   3.8%
11th grade                              7805   3.5%
Masters degree(MA MS MEng MEd MSW MBA)  7314   3.3%
9th grade                               7021   3.1%
Associates degree-occup /vocational     6074   2.7%
Other values (7)                        17468  7.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
school56574
 
8.2%
high54559
 
7.9%
graduate54559
 
7.9%
children53305
 
7.7%
grade41510
 
6.0%
no33676
 
4.9%
degree33231
 
4.8%
some31216
 
4.5%
college31216
 
4.5%
but31216
 
4.5%
Other values (42)268690
39.0%

Most occurring characters

ValueCountFrequency (%)
689752
15.5%
e515876
 
11.6%
o278580
 
6.3%
r274937
 
6.2%
g269066
 
6.0%
d253578
 
5.7%
h242424
 
5.4%
a231488
 
5.2%
l203059
 
4.6%
t170204
 
3.8%
Other values (37)1326874
29.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3152526
70.8%
Space Separator689752
 
15.5%
Uppercase Letter451486
 
10.1%
Decimal Number79147
 
1.8%
Open Punctuation32980
 
0.7%
Close Punctuation32980
 
0.7%
Dash Punctuation10893
 
0.2%
Other Punctuation6074
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e515876
16.4%
o278580
8.8%
r274937
8.7%
g269066
8.5%
d253578
8.0%
h242424
7.7%
a231488
7.3%
l203059
 
6.4%
t170204
 
5.4%
c150194
 
4.8%
Other values (9)563120
17.9%
Uppercase Letter
ValueCountFrequency (%)
B98185
21.7%
S70073
15.5%
A69949
15.5%
M55228
12.2%
H54559
12.1%
C53305
11.8%
E16065
 
3.6%
D14386
 
3.2%
W7314
 
1.6%
L4940
 
1.1%
Other values (3)7482
 
1.7%
Decimal Number
ValueCountFrequency (%)
129469
37.2%
79027
 
11.4%
89027
 
11.4%
08460
 
10.7%
97021
 
8.9%
24489
 
5.7%
53798
 
4.8%
63798
 
4.8%
32029
 
2.6%
42029
 
2.6%
Space Separator
ValueCountFrequency (%)
689752
100.0%
Open Punctuation
ValueCountFrequency (%)
(32980
100.0%
Close Punctuation
ValueCountFrequency (%)
)32980
100.0%
Dash Punctuation
ValueCountFrequency (%)
-10893
100.0%
Other Punctuation
ValueCountFrequency (%)
/6074
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3604012
80.9%
Common851826
 
19.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e515876
14.3%
o278580
 
7.7%
r274937
 
7.6%
g269066
 
7.5%
d253578
 
7.0%
h242424
 
6.7%
a231488
 
6.4%
l203059
 
5.6%
t170204
 
4.7%
c150194
 
4.2%
Other values (22)1014606
28.2%
Common
ValueCountFrequency (%)
689752
81.0%
(32980
 
3.9%
)32980
 
3.9%
129469
 
3.5%
-10893
 
1.3%
79027
 
1.1%
89027
 
1.1%
08460
 
1.0%
97021
 
0.8%
/6074
 
0.7%
Other values (5)16143
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII4455838
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
689752
15.5%
e515876
 
11.6%
o278580
 
6.3%
r274937
 
6.2%
g269066
 
6.0%
d253578
 
5.7%
h242424
 
5.4%
a231488
 
5.2%
l203059
 
4.6%
t170204
 
3.8%
Other values (37)1326874
29.8%

WagePerHour
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 1269
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.97750632
Minimum: 0
Maximum: 9900
Zeros: 211831
Zeros (%): 94.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 480
Maximum: 9900
Range: 9900
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 273.0454213
Coefficient of variation (CV): 4.966493383
Kurtosis: 152.8881264
Mean: 54.97750632
Median Absolute Deviation (MAD): 0
Skewness: 8.874793148
Sum: 12340416
Variance: 74553.80209
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
0                    211831  94.4%
500                  826     0.4%
600                  626     0.3%
700                  572     0.3%
800                  559     0.2%
1000                 460     0.2%
425                  419     0.2%
900                  345     0.2%
550                  315     0.1%
1200                 282     0.1%
Other values (1259)  8228    3.7%

Value   Count   Frequency (%)
0       211831  94.4%
20      1       < 0.1%
75      2       < 0.1%
100     9       < 0.1%
125     1       < 0.1%
135     1       < 0.1%
150     9       < 0.1%
170     1       < 0.1%
173     1       < 0.1%
178     1       < 0.1%

Value   Count   Frequency (%)
9900    1       < 0.1%
9800    2       < 0.1%
9400    2       < 0.1%
9000    1       < 0.1%
8831    1       < 0.1%
8800    2       < 0.1%
8600    1       < 0.1%
8500    1       < 0.1%
8300    1       < 0.1%
8000    6       < 0.1%

EnrollInEdUInstlastWk
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 MiB
Not in universe        210355
High school            7776
College or university  6332

Length

Max length: 22
Median length: 16
Mean length: 16.03068657
Min length: 12

Characters and Unicode

Total characters: 3598296
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value                  Count   Frequency (%)
Not in universe        210355  93.7%
High school            7776    3.5%
College or university  6332    2.8%

Length

Histogram of lengths of the category

Pie chart

Value       Count   Frequency (%)
not         210355  31.6%
in          210355  31.6%
universe    210355  31.6%
high        7776    1.2%
school      7776    1.2%
college     6332    1.0%
or          6332    1.0%
university  6332    1.0%
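
The word table above is consistent with lowercasing each category value, splitting it on whitespace, and weighting every word by the number of rows carrying that value. The sketch below applies that assumed tokenisation to the three EnrollInEdUInstlastWk values and reproduces the counts and percentages; the actual tokeniser used by the report is not shown in the source.

```python
from collections import Counter

# Value counts copied from the EnrollInEdUInstlastWk "Common Values" table.
value_counts = {
    "Not in universe": 210355,
    "High school": 7776,
    "College or university": 6332,
}

# Assumed tokenisation: lowercase, split on whitespace, weight by row count.
word_counts = Counter()
for value, count in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word:<12}{count:>8}{100 * count / total_words:>7.1f}%")
```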

Most occurring characters

ValueCountFrequency (%)
665613
18.5%
i441150
12.3%
e439706
12.2%
n427042
11.9%
o238571
 
6.6%
s224463
 
6.2%
r223019
 
6.2%
t216687
 
6.0%
u216687
 
6.0%
v216687
 
6.0%
Other values (8)288671
8.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2708220
75.3%
Space Separator665613
 
18.5%
Uppercase Letter224463
 
6.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i441150
16.3%
e439706
16.2%
n427042
15.8%
o238571
8.8%
s224463
8.3%
r223019
8.2%
t216687
8.0%
u216687
8.0%
v216687
8.0%
l20440
 
0.8%
Other values (4)43768
 
1.6%
Uppercase Letter
ValueCountFrequency (%)
N210355
93.7%
H7776
 
3.5%
C6332
 
2.8%
Space Separator
ValueCountFrequency (%)
665613
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2932683
81.5%
Common665613
 
18.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
i441150
15.0%
e439706
15.0%
n427042
14.6%
o238571
8.1%
s224463
7.7%
r223019
7.6%
t216687
7.4%
u216687
7.4%
v216687
7.4%
N210355
7.2%
Other values (7)78316
 
2.7%
Common
ValueCountFrequency (%)
665613
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3598296
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
665613
18.5%
i441150
12.3%
e439706
12.2%
n427042
11.9%
o238571
 
6.6%
s224463
 
6.2%
r223019
 
6.2%
t216687
 
6.0%
u216687
 
6.0%
v216687
 
6.0%
Other values (8)288671
8.0%

MaritalStatus
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.7 MiB
Never married                    97232
Married-civilian spouse present  94770
Divorced                         14395
Widowed                          11771
Separated                        3856
Other values (2)                 2439

Length

Max length: 32
Median length: 14
Mean length: 20.99920254
Min length: 8

Characters and Unicode

Total characters: 4713544
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Married-civilian spouse present
2nd row Never married
3rd row Married-civilian spouse present
4th row Widowed
5th row Never married

Common Values

Value                            Count  Frequency (%)
Never married                    97232  43.3%
Married-civilian spouse present  94770  42.2%
Divorced                         14395  6.4%
Widowed                          11771  5.2%
Separated                        3856   1.7%
Married-spouse absent            1696   0.8%
Married-A F spouse present       743    0.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
never97232
18.9%
married97232
18.9%
spouse95513
18.5%
present95513
18.5%
married-civilian94770
18.4%
divorced14395
 
2.8%
widowed11771
 
2.3%
separated3856
 
0.7%
married-spouse1696
 
0.3%
absent1696
 
0.3%
Other values (2)1486
 
0.3%

Most occurring characters

ValueCountFrequency (%)
e712714
15.1%
r599878
12.7%
515160
10.9%
i504917
10.7%
a298619
 
6.3%
s291627
 
6.2%
d236234
 
5.0%
v206397
 
4.4%
p196578
 
4.2%
n191979
 
4.1%
Other values (16)959441
20.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3875226
82.2%
Space Separator515160
 
10.9%
Uppercase Letter225949
 
4.8%
Dash Punctuation97209
 
2.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e712714
18.4%
r599878
15.5%
i504917
13.0%
a298619
7.7%
s291627
7.5%
d236234
 
6.1%
v206397
 
5.3%
p196578
 
5.1%
n191979
 
5.0%
o123375
 
3.2%
Other values (7)512908
13.2%
Uppercase Letter
ValueCountFrequency (%)
N97232
43.0%
M97209
43.0%
D14395
 
6.4%
W11771
 
5.2%
S3856
 
1.7%
A743
 
0.3%
F743
 
0.3%
Space Separator
ValueCountFrequency (%)
515160
100.0%
Dash Punctuation
ValueCountFrequency (%)
-97209
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4101175
87.0%
Common612369
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e712714
17.4%
r599878
14.6%
i504917
12.3%
a298619
7.3%
s291627
 
7.1%
d236234
 
5.8%
v206397
 
5.0%
p196578
 
4.8%
n191979
 
4.7%
o123375
 
3.0%
Other values (14)738857
18.0%
Common
ValueCountFrequency (%)
515160
84.1%
-97209
 
15.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII4713544
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e712714
15.1%
r599878
12.7%
515160
10.9%
i504917
10.7%
a298619
 
6.3%
s291627
 
6.2%
d236234
 
5.0%
v206397
 
4.4%
p196578
 
4.2%
n191979
 
4.1%
Other values (16)959441
20.4%

MajorIndustryCode
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.4 MiB
Not in universe or children     113109
Retail trade                    19356
Manufacturing-durable goods     10083
Education                       9344
Manufacturing-nondurable goods  7716
Other values (19)               64855

Length

Max length: 36
Median length: 28
Mean length: 24.37834298
Min length: 7

Characters and Unicode

Total characters: 5472036
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe or children
2nd row Manufacturing-nondurable goods
3rd row Personal services except private HH
4th row Not in universe or children
5th row Other professional services

Common Values

Value                              Count   Frequency (%)
Not in universe or children        113109  50.4%
Retail trade                       19356   8.6%
Manufacturing-durable goods        10083   4.5%
Education                          9344    4.2%
Manufacturing-nondurable goods     7716    3.4%
Finance insurance and real estate  6928    3.1%
Construction                       6855    3.1%
Business and repair services       6577    2.9%
Medical except hospital            5218    2.3%
Public administration              5130    2.3%
Other values (14)                  34147   15.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not113109
13.8%
universe113109
13.8%
or113109
13.8%
children113109
13.8%
in113109
13.8%
services24363
 
3.0%
trade23412
 
2.9%
retail19356
 
2.4%
goods17799
 
2.2%
and15025
 
1.8%
Other values (34)152406
18.6%

Most occurring characters

ValueCountFrequency (%)
817906
14.9%
e554544
10.1%
i511219
 
9.3%
n501646
 
9.2%
r499514
 
9.1%
o341954
 
6.2%
t272236
 
5.0%
s262440
 
4.8%
a214724
 
3.9%
c211802
 
3.9%
Other values (28)1284051
23.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter4405231
80.5%
Space Separator817906
 
14.9%
Uppercase Letter231100
 
4.2%
Dash Punctuation17799
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e554544
12.6%
i511219
11.6%
n501646
11.4%
r499514
11.3%
o341954
7.8%
t272236
 
6.2%
s262440
 
6.0%
a214724
 
4.9%
c211802
 
4.8%
u210611
 
4.8%
Other values (11)824541
18.7%
Uppercase Letter
ValueCountFrequency (%)
N113109
48.9%
M23699
 
10.3%
R19356
 
8.4%
E11189
 
4.8%
H10917
 
4.7%
P9533
 
4.1%
C8178
 
3.5%
F7171
 
3.1%
B6577
 
2.8%
O4921
 
2.1%
Other values (5)16450
 
7.1%
Space Separator
ValueCountFrequency (%)
817906
100.0%
Dash Punctuation
ValueCountFrequency (%)
-17799
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4636331
84.7%
Common835705
 
15.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e554544
12.0%
i511219
11.0%
n501646
10.8%
r499514
10.8%
o341954
 
7.4%
t272236
 
5.9%
s262440
 
5.7%
a214724
 
4.6%
c211802
 
4.6%
u210611
 
4.5%
Other values (26)1055641
22.8%
Common
ValueCountFrequency (%)
817906
97.9%
-17799
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII5472036
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
817906
14.9%
e554544
10.1%
i511219
 
9.3%
n501646
 
9.2%
r499514
 
9.1%
o341954
 
6.2%
t272236
 
5.0%
s262440
 
4.8%
a214724
 
3.9%
c211802
 
3.9%
Other values (28)1284051
23.5%

MajorOccupationCode
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 MiB
Not in universe                 113109
Adm support including clerical  16561
Professional specialty          15559
Executive admin and managerial  14081
Other service                   13723
Other values (10)               51430

Length

Max length: 38
Median length: 16
Mean length: 20.76106084
Min length: 6

Characters and Unicode

Total characters: 4660090
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row Transportation and material moving
3rd row Other service
4th row Not in universe
5th row Executive admin and managerial

Common Values

Value                                    Count    Frequency (%)
Not in universe                          113109   50.4%
Adm support including clerical           16561    7.4%
Professional specialty                   15559    6.9%
Executive admin and managerial           14081    6.3%
Other service                            13723    6.1%
Sales                                    13363    6.0%
Precision production craft & repair      11923    5.3%
Machine operators assmblrs & inspctrs    7192     3.2%
Handlers equip cleaners etc              4648     2.1%
Transportation and material moving       4565     2.0%
Other values (5)                         9739     4.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not113109
16.1%
universe113109
16.1%
in113109
16.1%
and25551
 
3.6%
support19929
 
2.8%
19115
 
2.7%
including16561
 
2.4%
adm16561
 
2.4%
clerical16561
 
2.4%
specialty15559
 
2.2%
Other values (33)231293
33.0%

Most occurring characters

ValueCountFrequency (%)
705105
15.1%
i466341
10.0%
e461606
9.9%
n403643
 
8.7%
r337706
 
7.2%
s292728
 
6.3%
t244549
 
5.2%
o235295
 
5.0%
a227088
 
4.9%
u181171
 
3.9%
Other values (24)1104858
23.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3711368
79.6%
Space Separator705105
 
15.1%
Uppercase Letter224502
 
4.8%
Other Punctuation19115
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i466341
12.6%
e461606
12.4%
n403643
10.9%
r337706
9.1%
s292728
7.9%
t244549
 
6.6%
o235295
 
6.3%
a227088
 
6.1%
u181171
 
4.9%
c163940
 
4.4%
Other values (12)697301
18.8%
Uppercase Letter
ValueCountFrequency (%)
N113109
50.4%
P30277
 
13.5%
A16600
 
7.4%
E14081
 
6.3%
O13723
 
6.1%
S13363
 
6.0%
T7933
 
3.5%
M7192
 
3.2%
H4648
 
2.1%
F3576
 
1.6%
Space Separator
ValueCountFrequency (%)
705105
100.0%
Other Punctuation
ValueCountFrequency (%)
&19115
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3935870
84.5%
Common724220
 
15.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
i466341
11.8%
e461606
11.7%
n403643
10.3%
r337706
 
8.6%
s292728
 
7.4%
t244549
 
6.2%
o235295
 
6.0%
a227088
 
5.8%
u181171
 
4.6%
c163940
 
4.2%
Other values (22)921803
23.4%
Common
ValueCountFrequency (%)
705105
97.4%
&19115
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII4660090
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
705105
15.1%
i466341
10.0%
e461606
9.9%
n403643
 
8.7%
r337706
 
7.2%
s292728
 
6.3%
t244549
 
5.2%
o235295
 
5.0%
a227088
 
4.9%
u181171
 
3.9%
Other values (24)1104858
23.7%

Race
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size13.7 MiB
White
188125 
Black
22980 
Asian or Pacific Islander
 
6574
Other
 
4190
Amer Indian Aleut or Eskimo
 
2594

Length

Max length28
Median length6
Mean length6.839995901
Min length6

Characters and Unicode

Total characters1535326
Distinct characters24
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row White
2nd row White
3rd row White
4th row White
5th row White

Common Values

Value                          Count    Frequency (%)
White                          188125   83.8%
Black                          22980    10.2%
Asian or Pacific Islander      6574     2.9%
Other                          4190     1.9%
Amer Indian Aleut or Eskimo    2594     1.2%
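
The Frequency (%) column is each count divided by the number of observations. A minimal sketch that rechecks the Race figures from the counts listed above (the dictionary is simply retyped from the table; nothing here comes from the profiling tool's own code):

```python
# Counts retyped from the Common Values table for Race.
counts = {
    "White": 188125,
    "Black": 22980,
    "Asian or Pacific Islander": 6574,
    "Other": 4190,
    "Amer Indian Aleut or Eskimo": 2594,
}

# With no missing values, the counts should add up to the observation count.
total = sum(counts.values())

# Share of each category, rounded to one decimal place as in the report.
percentages = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```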

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
white188125
73.9%
black22980
 
9.0%
or9168
 
3.6%
asian6574
 
2.6%
pacific6574
 
2.6%
islander6574
 
2.6%
other4190
 
1.6%
amer2594
 
1.0%
indian2594
 
1.0%
aleut2594
 
1.0%

Most occurring characters

ValueCountFrequency (%)
254561
16.6%
i213035
13.9%
e204077
13.3%
t194909
12.7%
h192315
12.5%
W188125
12.3%
a45296
 
3.0%
c36128
 
2.4%
l32148
 
2.1%
k25574
 
1.7%
Other values (14)149158
9.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1035372
67.4%
Space Separator254561
 
16.6%
Uppercase Letter245393
 
16.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i213035
20.6%
e204077
19.7%
t194909
18.8%
h192315
18.6%
a45296
 
4.4%
c36128
 
3.5%
l32148
 
3.1%
k25574
 
2.5%
r22526
 
2.2%
n18336
 
1.8%
Other values (6)51028
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
W188125
76.7%
B22980
 
9.4%
A11762
 
4.8%
I9168
 
3.7%
P6574
 
2.7%
O4190
 
1.7%
E2594
 
1.1%
Space Separator
ValueCountFrequency (%)
254561
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1280765
83.4%
Common254561
 
16.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
i213035
16.6%
e204077
15.9%
t194909
15.2%
h192315
15.0%
W188125
14.7%
a45296
 
3.5%
c36128
 
2.8%
l32148
 
2.5%
k25574
 
2.0%
B22980
 
1.8%
Other values (13)126178
9.9%
Common
ValueCountFrequency (%)
254561
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1535326
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
254561
16.6%
i213035
13.9%
e204077
13.3%
t194909
12.7%
h192315
12.5%
W188125
12.3%
a45296
 
3.0%
c36128
 
2.4%
l32148
 
2.1%
k25574
 
1.7%
Other values (14)149158
9.7%

HispanicOrigin
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.6 MiB
All other
193369 
Mexican-American
 
9030
Mexican (Mexicano)
 
8222
Central or South American
 
4417
Puerto Rican
 
3676
Other values (5)
 
5749

Length

Max length26
Median length10
Mean length10.9711712
Min length3

Characters and Unicode

Total characters2462622
Distinct characters31
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row All other
2nd row All other
3rd row All other
4th row All other
5th row All other

Common Values

Value                        Count    Frequency (%)
All other                    193369   86.1%
Mexican-American             9030     4.0%
Mexican (Mexicano)           8222     3.7%
Central or South American    4417     2.0%
Puerto Rican                 3676     1.6%
Other Spanish                2778     1.2%
Cuban                        1323     0.6%
NA                           964      0.4%
Do not know                  345      0.2%
Chicano                      339      0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
other196147
43.9%
all193369
43.3%
mexican-american9030
 
2.0%
mexican8222
 
1.8%
mexicano8222
 
1.8%
central4417
 
1.0%
or4417
 
1.0%
south4417
 
1.0%
american4417
 
1.0%
puerto3676
 
0.8%
Other values (8)10115
 
2.3%

Most occurring characters

ValueCountFrequency (%)
446449
18.1%
l391155
15.9%
e243161
9.9%
r222104
9.0%
o215475
8.7%
t209002
8.5%
A207780
8.4%
h203681
8.3%
n52144
 
2.1%
a51454
 
2.1%
Other values (21)220217
8.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1732732
70.4%
Space Separator446449
 
18.1%
Uppercase Letter257967
 
10.5%
Dash Punctuation9030
 
0.4%
Open Punctuation8222
 
0.3%
Close Punctuation8222
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l391155
22.6%
e243161
14.0%
r222104
12.8%
o215475
12.4%
t209002
12.1%
h203681
11.8%
n52144
 
3.0%
a51454
 
3.0%
i45714
 
2.6%
c42936
 
2.5%
Other values (8)55906
 
3.2%
Uppercase Letter
ValueCountFrequency (%)
A207780
80.5%
M25474
 
9.9%
S7195
 
2.8%
C6079
 
2.4%
P3676
 
1.4%
R3676
 
1.4%
O2778
 
1.1%
N964
 
0.4%
D345
 
0.1%
Space Separator
ValueCountFrequency (%)
446449
100.0%
Open Punctuation
ValueCountFrequency (%)
(8222
100.0%
Close Punctuation
ValueCountFrequency (%)
)8222
100.0%
Dash Punctuation
ValueCountFrequency (%)
-9030
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1990699
80.8%
Common471923
 
19.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
l391155
19.6%
e243161
12.2%
r222104
11.2%
o215475
10.8%
t209002
10.5%
A207780
10.4%
h203681
10.2%
n52144
 
2.6%
a51454
 
2.6%
i45714
 
2.3%
Other values (17)149029
 
7.5%
Common
ValueCountFrequency (%)
446449
94.6%
-9030
 
1.9%
(8222
 
1.7%
)8222
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII2462622
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
446449
18.1%
l391155
15.9%
e243161
9.9%
r222104
9.0%
o215475
8.7%
t209002
8.5%
A207780
8.4%
h203681
8.3%
n52144
 
2.1%
a51454
 
2.1%
Other values (21)220217
8.9%

Sex
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size13.5 MiB
Female
116770 
Male
107693 

Length

Max length7
Median length7
Mean length6.040438736
Min length5

Characters and Unicode

Total characters1355855
Distinct characters7
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
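
The length figures for Sex (minimum 5, maximum 7) are one character longer than the visible labels "Male" and "Female", which suggests, as an inference rather than something the report states, that each stored value carries one padding character. The sketch below tests that assumption against the reported mean length and total characters; `PAD` is a hypothetical name introduced for illustration.

```python
# Counts retyped from the Sex overview.
counts = {"Female": 116770, "Male": 107693}

# Assumption: each stored value has a single leading space.
PAD = " "

# Length of each label once the assumed padding is included.
lengths = {label: len(PAD + label) for label in counts}

n = sum(counts.values())
total_chars = sum(lengths[label] * counts[label] for label in counts)
mean_length = total_chars / n

print(lengths)       # padded length per label
print(total_chars)   # compare with "Total characters"
print(mean_length)   # compare with "Mean length"
```

If the assumption holds, stripping whitespace (for example with `str.strip`) before modelling would keep labels such as " Male" and "Male" from being treated as different categories.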

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Male
2nd row Male
3rd row Male
4th row Female
5th row Male

Common Values

Value     Count    Frequency (%)
Female    116770   52.0%
Male      107693   48.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
female116770
52.0%
male107693
48.0%

Most occurring characters

ValueCountFrequency (%)
e341233
25.2%
224463
16.6%
a224463
16.6%
l224463
16.6%
F116770
 
8.6%
m116770
 
8.6%
M107693
 
7.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter906929
66.9%
Space Separator224463
 
16.6%
Uppercase Letter224463
 
16.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e341233
37.6%
a224463
24.7%
l224463
24.7%
m116770
 
12.9%
Uppercase Letter
ValueCountFrequency (%)
F116770
52.0%
M107693
48.0%
Space Separator
ValueCountFrequency (%)
224463
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1131392
83.4%
Common224463
 
16.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e341233
30.2%
a224463
19.8%
l224463
19.8%
F116770
 
10.3%
m116770
 
10.3%
M107693
 
9.5%
Common
ValueCountFrequency (%)
224463
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1355855
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e341233
25.2%
224463
16.6%
a224463
16.6%
l224463
16.6%
F116770
 
8.6%
m116770
 
8.6%
M107693
 
7.9%

MemberOfALaborUnion
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size15.4 MiB
Not in universe
203036 
No
 
18066
Yes
 
3361

Length

Max length16
Median length16
Mean length14.7740073
Min length3

Characters and Unicode

Total characters3316218
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value              Count    Frequency (%)
Not in universe    203036   90.5%
No                 18066    8.0%
Yes                3361     1.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not203036
32.2%
in203036
32.2%
universe203036
32.2%
no18066
 
2.9%
yes3361
 
0.5%
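
The word table above can be rechecked by lowercasing each category label, splitting it on whitespace, and weighting every word by the category's count. This is a sketch of that idea using only the standard library; it is not the profiling tool's implementation, and the leading space in each label is an assumption based on the reported lengths.

```python
from collections import Counter

# Category counts for MemberOfALaborUnion, retyped from the report.
# The leading space is assumed; split() ignores it either way.
value_counts = {" Not in universe": 203036, " No": 18066, " Yes": 3361}

words = Counter()
for value, count in value_counts.items():
    for word in value.lower().split():
        words[word] += count  # each row contributes one occurrence per word

total_words = sum(words.values())
shares = {word: round(100 * n / total_words, 1) for word, n in words.items()}

for word, n in words.most_common():
    print(word, n, f"{shares[word]}%")
```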

Most occurring characters

ValueCountFrequency (%)
630535
19.0%
e409433
12.3%
i406072
12.2%
n406072
12.2%
N221102
 
6.7%
o221102
 
6.7%
s206397
 
6.2%
t203036
 
6.1%
u203036
 
6.1%
v203036
 
6.1%
Other values (2)206397
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2461220
74.2%
Space Separator630535
 
19.0%
Uppercase Letter224463
 
6.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e409433
16.6%
i406072
16.5%
n406072
16.5%
o221102
9.0%
s206397
8.4%
t203036
8.2%
u203036
8.2%
v203036
8.2%
r203036
8.2%
Uppercase Letter
ValueCountFrequency (%)
N221102
98.5%
Y3361
 
1.5%
Space Separator
ValueCountFrequency (%)
630535
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2685683
81.0%
Common630535
 
19.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e409433
15.2%
i406072
15.1%
n406072
15.1%
N221102
8.2%
o221102
8.2%
s206397
7.7%
t203036
7.6%
u203036
7.6%
v203036
7.6%
r203036
7.6%
Common
ValueCountFrequency (%)
630535
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3316218
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
630535
19.0%
e409433
12.3%
i406072
12.2%
n406072
12.2%
N221102
 
6.7%
o221102
 
6.7%
s206397
 
6.2%
t203036
 
6.1%
u203036
 
6.1%
v203036
 
6.1%
Other values (2)206397
 
6.2%

ReasonForUnemployment
Categorical

HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size15.6 MiB
Not in universe
217541 
Other job loser
 
2355
Re-entrant
 
2323
Job loser - on layoff
 
1108
Job leaver
 
636

Length

Max length22
Median length16
Mean length15.95479433
Min length11

Characters and Unicode

Total characters3581261
Distinct characters23
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value                    Count    Frequency (%)
Not in universe          217541   96.9%
Other job loser          2355     1.0%
Re-entrant               2323     1.0%
Job loser - on layoff    1108     0.5%
Job leaver               636      0.3%
New entrant              500      0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not217541
32.5%
in217541
32.5%
universe217541
32.5%
job4099
 
0.6%
loser3463
 
0.5%
other2355
 
0.4%
re-entrant2323
 
0.3%
1108
 
0.2%
on1108
 
0.2%
layoff1108
 
0.2%
Other values (3)1636
 
0.2%

Most occurring characters

ValueCountFrequency (%)
669823
18.7%
e447818
12.5%
n441836
12.3%
i435082
12.1%
o227319
 
6.3%
r226818
 
6.3%
t225542
 
6.3%
s221004
 
6.2%
v218177
 
6.1%
N218041
 
6.1%
Other values (13)249801
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2683544
74.9%
Space Separator669823
 
18.7%
Uppercase Letter224463
 
6.3%
Dash Punctuation3431
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e447818
16.7%
n441836
16.5%
i435082
16.2%
o227319
8.5%
r226818
8.5%
t225542
8.4%
s221004
8.2%
v218177
8.1%
u217541
8.1%
l5207
 
0.2%
Other values (7)17200
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
N218041
97.1%
O2355
 
1.0%
R2323
 
1.0%
J1744
 
0.8%
Space Separator
ValueCountFrequency (%)
669823
100.0%
Dash Punctuation
ValueCountFrequency (%)
-3431
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2908007
81.2%
Common673254
 
18.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e447818
15.4%
n441836
15.2%
i435082
15.0%
o227319
7.8%
r226818
7.8%
t225542
7.8%
s221004
7.6%
v218177
7.5%
N218041
7.5%
u217541
7.5%
Other values (11)28829
 
1.0%
Common
ValueCountFrequency (%)
669823
99.5%
-3431
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII3581261
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
669823
18.7%
e447818
12.5%
n441836
12.3%
i435082
12.1%
o227319
 
6.3%
r226818
 
6.3%
t225542
 
6.3%
s221004
 
6.2%
v218177
 
6.1%
N218041
 
6.1%
Other values (13)249801
 
7.0%

FullOrPartTimeEmploymentStat
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size17.2 MiB
Children or Armed Forces
139314 
Full-time schedules
45944 
Not in labor force
29933 
PT for non-econ reasons usually FT
 
3770
Unemployed full-time
 
2637
Other values (3)
 
2865

Length

Max length35
Median length25
Mean length23.33659445
Min length19

Characters and Unicode

Total characters5238202
Distinct characters27
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in labor force
2nd row Children or Armed Forces
3rd row Full-time schedules
4th row Not in labor force
5th row Children or Armed Forces

Common Values

Value                                 Count    Frequency (%)
Children or Armed Forces              139314   62.1%
Full-time schedules                   45944    20.5%
Not in labor force                    29933    13.3%
PT for non-econ reasons usually FT    3770     1.7%
Unemployed full-time                  2637     1.2%
PT for econ reasons usually PT        1361     0.6%
Unemployed part- time                 933      0.4%
PT for econ reasons usually FT        571      0.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
children139314
17.2%
or139314
17.2%
armed139314
17.2%
forces139314
17.2%
full-time48581
 
6.0%
schedules45944
 
5.7%
not29933
 
3.7%
in29933
 
3.7%
labor29933
 
3.7%
force29933
 
3.7%
Other values (10)39648
 
4.9%

Most occurring characters

ValueCountFrequency (%)
811161
15.5%
r629459
12.0%
e607821
11.6%
o392873
 
7.5%
d328142
 
6.3%
l327327
 
6.2%
s248308
 
4.7%
c220893
 
4.2%
i218761
 
4.2%
m192398
 
3.7%
Other values (17)1261059
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3853560
73.6%
Space Separator811161
 
15.5%
Uppercase Letter520197
 
9.9%
Dash Punctuation53284
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r629459
16.3%
e607821
15.8%
o392873
10.2%
d328142
8.5%
l327327
8.5%
s248308
 
6.4%
c220893
 
5.7%
i218761
 
5.7%
m192398
 
5.0%
n191761
 
5.0%
Other values (8)495817
12.9%
Uppercase Letter
ValueCountFrequency (%)
F189599
36.4%
C139314
26.8%
A139314
26.8%
N29933
 
5.8%
T11404
 
2.2%
P7063
 
1.4%
U3570
 
0.7%
Space Separator
ValueCountFrequency (%)
811161
100.0%
Dash Punctuation
ValueCountFrequency (%)
-53284
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4373757
83.5%
Common864445
 
16.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
r629459
14.4%
e607821
13.9%
o392873
 
9.0%
d328142
 
7.5%
l327327
 
7.5%
s248308
 
5.7%
c220893
 
5.1%
i218761
 
5.0%
m192398
 
4.4%
n191761
 
4.4%
Other values (15)1016014
23.2%
Common
ValueCountFrequency (%)
811161
93.8%
-53284
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII5238202
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
811161
15.5%
r629459
12.0%
e607821
11.6%
o392873
 
7.5%
d328142
 
6.3%
l327327
 
6.2%
s248308
 
4.7%
c220893
 
4.2%
i218761
 
4.2%
m192398
 
3.7%
Other values (17)1261059
24.1%

CapitalGains
Real number (ℝ≥0)

ZEROS

Distinct        131
Distinct (%)    0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            425.0153967
Minimum         0
Maximum         99999
Zeros           216173
Zeros (%)       96.3%
Negative        0
Negative (%)    0.0%
Memory size     1.7 MiB

Quantile statistics

Minimum                      0
5-th percentile              0
Q1                           0
median                       0
Q3                           0
95-th percentile             0
Maximum                      99999
Range                        99999
Interquartile range (IQR)    0

Descriptive statistics

Standard deviation                 4607.07145
Coefficient of variation (CV)      10.83977542
Kurtosis                           407.080054
Mean                               425.0153967
Median Absolute Deviation (MAD)    0
Skewness                           19.2963265
Sum                                95400231
Variance                           21225107.34
Monotonicity                       Not monotonic
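
Several of the CapitalGains figures are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below rechecks the reported values against those identities; the inputs are retyped from the report, and the variable names are chosen here for illustration.

```python
# Figures retyped from the CapitalGains section of the report.
n = 224463              # number of observations (no missing values)
total = 95400231        # "Sum"
zeros = 216173          # "Zeros"
std = 4607.07145        # "Standard deviation"

mean = total / n                # should match "Mean"
cv = std / mean                 # should match "Coefficient of variation (CV)"
variance = std ** 2             # should match "Variance" up to rounding of std
zeros_pct = 100 * zeros / n     # should match "Zeros (%)"

print(f"mean      = {mean:.7f}")
print(f"cv        = {cv:.8f}")
print(f"variance  = {variance:.2f}")
print(f"zeros (%) = {zeros_pct:.1f}")
```

With more than 96% of the values equal to zero, every quantile from the 5th to the 95th percentile is 0, which is why the IQR and MAD are both 0 while the standard deviation is large.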
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     216173   96.3%
15024                 862      0.4%
7298                  693      0.3%
7688                  667      0.3%
99999                 420      0.2%
3103                  253      0.1%
5178                  230      0.1%
4386                  186      0.1%
5013                  160      0.1%
10520                 130      0.1%
Other values (121)    4689     2.1%
Value    Count    Frequency (%)
0        216173   96.3%
114      17       < 0.1%
401      41       < 0.1%
594      98       < 0.1%
914      19       < 0.1%
991      69       < 0.1%
1055     79       < 0.1%
1086     112      < 0.1%
1090     1        < 0.1%
1111     7        < 0.1%
Value    Count    Frequency (%)
99999    420      0.2%
41310    5        < 0.1%
34095    10       < 0.1%
27828    110      < 0.1%
25236    24       < 0.1%
25124    24       < 0.1%
22040    2        < 0.1%
20051    82       < 0.1%
18481    16       < 0.1%
15831    18       < 0.1%

CapitalLosses
Real number (ℝ≥0)

ZEROS

Distinct        113
Distinct (%)    0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            37.51214231
Minimum         0
Maximum         4608
Zeros           220033
Zeros (%)       98.0%
Negative        0
Negative (%)    0.0%
Memory size     1.7 MiB

Quantile statistics

Minimum                      0
5-th percentile              0
Q1                           0
median                       0
Q3                           0
95-th percentile             0
Maximum                      4608
Range                        4608
Interquartile range (IQR)    0

Descriptive statistics

Standard deviation                 272.7890742
Coefficient of variation (CV)      7.272020669
Kurtosis                           62.56600846
Mean                               37.51214231
Median Absolute Deviation (MAD)    0
Skewness                           7.65772334
Sum                                8420088
Variance                           74413.87903
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     220033   98.0%
1902                  460      0.2%
1977                  456      0.2%
1887                  396      0.2%
1602                  208      0.1%
2415                  130      0.1%
1485                  118      0.1%
1848                  107      < 0.1%
1876                  97       < 0.1%
1672                  97       < 0.1%
Other values (103)    2361     1.1%
Value    Count    Frequency (%)
0        220033   98.0%
155      2        < 0.1%
213      13       < 0.1%
323      8        < 0.1%
419      40       < 0.1%
625      30       < 0.1%
653      10       < 0.1%
772      6        < 0.1%
810      7        < 0.1%
880      10       < 0.1%
Value    Count    Frequency (%)
4608     7        < 0.1%
4356     38       < 0.1%
3900     3        < 0.1%
3770     7        < 0.1%
3683     5        < 0.1%
3500     8        < 0.1%
3175     10       < 0.1%
3004     12       < 0.1%
2824     34       < 0.1%
2788     5        < 0.1%

DividendsFromStocks
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct        1555
Distinct (%)    0.7%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            194.5972076
Minimum         0
Maximum         99999
Zeros           200794
Zeros (%)       89.5%
Negative        0
Negative (%)    0.0%
Memory size     1.7 MiB

Quantile statistics

Minimum                      0
5-th percentile              0
Q1                           0
median                       0
Q3                           0
95-th percentile             400
Maximum                      99999
Range                        99999
Interquartile range (IQR)    0

Descriptive statistics

Standard deviation                 1941.531084
Coefficient of variation (CV)      9.977178543
Kurtosis                           1073.540847
Mean                               194.5972076
Median Absolute Deviation (MAD)    0
Skewness                           27.45959869
Sum                                43679873
Variance                           3769542.95
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count    Frequency (%)
0                      200794   89.5%
100                    1262     0.6%
500                    1176     0.5%
200                    1000     0.4%
1000                   983      0.4%
50                     918      0.4%
2000                   624      0.3%
150                    622      0.3%
250                    613      0.3%
300                    596      0.3%
Other values (1545)    15875    7.1%
Value    Count    Frequency (%)
0        200794   89.5%
1        519      0.2%
2        215      0.1%
3        138      0.1%
4        81       < 0.1%
5        189      0.1%
6        108      < 0.1%
7        93       < 0.1%
8        105      < 0.1%
9        66       < 0.1%
Value    Count    Frequency (%)
99999    24       < 0.1%
90000    1        < 0.1%
81000    1        < 0.1%
75000    7        < 0.1%
70000    3        < 0.1%
66621    2        < 0.1%
60000    7        < 0.1%
57678    1        < 0.1%
55000    2        < 0.1%
54600    2        < 0.1%

TaxFilerStat
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size15.1 MiB
Nonfiler
84429 
Joint both under 65
75706 
Single
42203 
Joint both 65+
9416 
Head of household
 
8301

Length

Max length29
Median length9
Mean length13.31128961
Min length7

Characters and Unicode

Total characters2987892
Distinct characters24
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Nonfiler
2nd row Single
3rd row Joint both under 65
4th row Single
5th row Nonfiler

Common Values

Value                           Count    Frequency (%)
Nonfiler                        84429    37.6%
Joint both under 65             75706    33.7%
Single                          42203    18.8%
Joint both 65+                  9416     4.2%
Head of household               8301     3.7%
Joint one under 65 & one 65+    4408     2.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
6593938
18.3%
joint89530
17.4%
both85122
16.6%
nonfiler84429
16.4%
under80114
15.6%
single42203
8.2%
one8816
 
1.7%
head8301
 
1.6%
of8301
 
1.6%
household8301
 
1.6%

Most occurring characters

ValueCountFrequency (%)
513463
17.2%
n305092
10.2%
o292800
 
9.8%
e232164
 
7.8%
i216162
 
7.2%
t174652
 
5.8%
r164543
 
5.5%
l134933
 
4.5%
h101724
 
3.4%
d96716
 
3.2%
Other values (14)755643
25.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2043858
68.4%
Space Separator513463
 
17.2%
Uppercase Letter224463
 
7.5%
Decimal Number187876
 
6.3%
Math Symbol13824
 
0.5%
Other Punctuation4408
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n305092
14.9%
o292800
14.3%
e232164
11.4%
i216162
10.6%
t174652
8.5%
r164543
8.1%
l134933
6.6%
h101724
 
5.0%
d96716
 
4.7%
f92730
 
4.5%
Other values (5)232342
11.4%
Uppercase Letter
ValueCountFrequency (%)
J89530
39.9%
N84429
37.6%
S42203
18.8%
H8301
 
3.7%
Decimal Number
ValueCountFrequency (%)
693938
50.0%
593938
50.0%
Space Separator
ValueCountFrequency (%)
513463
100.0%
Math Symbol
ValueCountFrequency (%)
+13824
100.0%
Other Punctuation
ValueCountFrequency (%)
&4408
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2268321
75.9%
Common719571
 
24.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n305092
13.5%
o292800
12.9%
e232164
10.2%
i216162
9.5%
t174652
 
7.7%
r164543
 
7.3%
l134933
 
5.9%
h101724
 
4.5%
d96716
 
4.3%
f92730
 
4.1%
Other values (9)456805
20.1%
Common
ValueCountFrequency (%)
513463
71.4%
693938
 
13.1%
593938
 
13.1%
+13824
 
1.9%
&4408
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII2987892
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
513463
17.2%
n305092
10.2%
o292800
 
9.8%
e232164
 
7.8%
i216162
 
7.2%
t174652
 
5.8%
r164543
 
5.5%
l134933
 
4.5%
h101724
 
3.4%
d96716
 
3.2%
Other values (14)755643
25.3%

RegionOfPreviousResidence
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size15.5 MiB
Not in universe
206814 
South
 
5480
West
 
4589
Midwest
 
3990
Northeast
 
3037

Length

Max length16
Median length16
Mean length15.28541452
Min length5

Characters and Unicode

Total characters3431010
Distinct characters20
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in universe
2nd row South
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value              Count    Frequency (%)
Not in universe    206814   92.1%
South              5480     2.4%
West               4589     2.0%
Midwest            3990     1.8%
Northeast          3037     1.4%
Abroad             553      0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not206814
32.4%
in206814
32.4%
universe206814
32.4%
south5480
 
0.9%
west4589
 
0.7%
midwest3990
 
0.6%
northeast3037
 
0.5%
abroad553
 
0.1%

Most occurring characters

ValueCountFrequency (%)
638091
18.6%
e425244
12.4%
i417618
12.2%
n413628
12.1%
t226947
 
6.6%
s218430
 
6.4%
o215884
 
6.3%
u212294
 
6.2%
r210404
 
6.1%
N209851
 
6.1%
Other values (10)242619
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2568456
74.9%
Space Separator638091
 
18.6%
Uppercase Letter224463
 
6.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e425244
16.6%
i417618
16.3%
n413628
16.1%
t226947
8.8%
s218430
8.5%
o215884
8.4%
u212294
8.3%
r210404
8.2%
v206814
8.1%
h8517
 
0.3%
Other values (4)12676
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
N209851
93.5%
S5480
 
2.4%
W4589
 
2.0%
M3990
 
1.8%
A553
 
0.2%
Space Separator
ValueCountFrequency (%)
638091
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2792919
81.4%
Common638091
 
18.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e425244
15.2%
i417618
15.0%
n413628
14.8%
t226947
8.1%
s218430
7.8%
o215884
7.7%
u212294
7.6%
r210404
7.5%
N209851
7.5%
v206814
7.4%
Other values (9)35805
 
1.3%
Common
ValueCountFrequency (%)
638091
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3431010
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
638091
18.6%
e425244
12.4%
i417618
12.2%
n413628
12.1%
t226947
 
6.6%
s218430
 
6.4%
o215884
 
6.3%
u212294
 
6.2%
r210404
 
6.1%
N209851
 
6.1%
Other values (10)242619
 
7.1%

StateOfPreviousResidence
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct50
Distinct (%)< 0.1%
Missing794
Missing (%)0.4%
Memory size15.5 MiB
Not in universe
206814 
California
 
1952
Utah
 
1202
Florida
 
956
North Carolina
 
917
Other values (45)
 
11828

Length

Max length21
Median length16
Mean length15.50706177
Min length5

Characters and Unicode

Total characters3468449
Distinct characters45
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in universe
2nd row Arkansas
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value                Count    Frequency (%)
Not in universe      206814   92.1%
California           1952     0.9%
Utah                 1202     0.5%
Florida              956      0.4%
North Carolina       917      0.4%
Abroad               725      0.3%
Oklahoma             677      0.3%
Minnesota            665      0.3%
Indiana              640      0.3%
North Dakota         523      0.2%
Other values (40)    8598     3.8%
(Missing)            794      0.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not206814
32.3%
universe206814
32.3%
in206814
32.3%
california1952
 
0.3%
north1440
 
0.2%
utah1202
 
0.2%
new1088
 
0.2%
carolina1029
 
0.2%
florida956
 
0.1%
abroad725
 
0.1%
Other values (45)11754
 
1.8%

Most occurring characters

ValueCountFrequency (%)
640588
18.5%
i427997
12.3%
n424720
12.2%
e420032
12.1%
o219817
 
6.3%
r216147
 
6.2%
s213006
 
6.1%
t212884
 
6.1%
N209749
 
6.0%
u208185
 
6.0%
Other values (35)275324
7.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2601027
75.0%
Space Separator640588
 
18.5%
Uppercase Letter226834
 
6.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i427997
16.5%
n424720
16.3%
e420032
16.1%
o219817
8.5%
r216147
8.3%
s213006
8.2%
t212884
8.2%
u208185
8.0%
v207233
8.0%
a21431
 
0.8%
Other values (14)29575
 
1.1%
Uppercase Letter
ValueCountFrequency (%)
N209749
92.5%
C3490
 
1.5%
M2785
 
1.2%
A1806
 
0.8%
U1202
 
0.5%
O1174
 
0.5%
I1061
 
0.5%
F956
 
0.4%
D901
 
0.4%
W627
 
0.3%
Other values (10)3083
 
1.4%
Space Separator
ValueCountFrequency (%)
640588
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2827861
81.5%
Common640588
 
18.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
i427997
15.1%
n424720
15.0%
e420032
14.9%
o219817
7.8%
r216147
7.6%
s213006
7.5%
t212884
7.5%
N209749
7.4%
u208185
7.4%
v207233
7.3%
Other values (34)68091
 
2.4%
Common
ValueCountFrequency (%)
640588
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3468449
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
640588
18.5%
i427997
12.3%
n424720
12.2%
e420032
12.1%
o219817
 
6.3%
r216147
 
6.2%
s213006
 
6.1%
t212884
 
6.1%
N209749
 
6.0%
u208185
 
6.0%
Other values (35)275324
7.9%

DetailedHholdAndFamStat
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct38
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size17.7 MiB
Householder
59869 
Child <18 never marr not in subfamily
56542 
Spouse of householder
46838 
Nonfamily householder
25087 
Child 18+ never marr Not in a subfamily
13586 
Other values (33)
22541 

Length

Max length             48
Median length          22
Mean length            25.71041107
Min length             12

Characters and Unicode

Total characters       5771036
Distinct characters    35
Distinct categories    5
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
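
As a rough illustration of the property lookup described above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the three sample strings (taken from this variable's values) are illustrative choices; the report generator's own implementation is not shown here and may differ.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode general category (e.g. 'Lu', 'Ll', 'Zs')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three category labels from DetailedHholdAndFamStat, used as a tiny sample.
sample = [
    "Householder",
    "Child <18 never marr not in subfamily",
    "Child 18+ never marr Not in a subfamily",
]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names in the tables of this report: `Ll` is Lowercase Letter, `Lu` Uppercase Letter, `Zs` Space Separator, `Nd` Decimal Number, and `Sm` Math Symbol (which is where `<` and `+` land).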

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Householder
2nd row Secondary individual
3rd row Householder
4th row Secondary individual
5th row Child 18+ never marr Not in a subfamily

Common Values

Value                                              Count    Frequency (%)
Householder                                        59869    26.7%
Child <18 never marr not in subfamily              56542    25.2%
Spouse of householder                              46838    20.9%
Nonfamily householder                              25087    11.2%
Child 18+ never marr Not in a subfamily            13586    6.1%
Secondary individual                               6938     3.1%
Other Rel 18+ ever marr not in subfamily           2221     1.0%
Grandchild <18 never marr child of subfamily RP    2062     0.9%
Other Rel 18+ never marr not in subfamily          1927     0.9%
Grandchild <18 never marr not in subfamily         1183     0.5%
Other values (28)                                  8210     3.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
householder131794
14.9%
subfamily85520
9.7%
1884687
9.6%
marr82929
9.4%
never78011
8.8%
in77977
8.8%
not77766
8.8%
child76610
8.6%
of55485
6.3%
spouse47824
 
5.4%
Other values (15)87275
9.9%

Most occurring characters

ValueCountFrequency (%)
885878
15.4%
e502008
 
8.7%
o476899
 
8.3%
r401527
 
7.0%
l338479
 
5.9%
h291196
 
5.0%
i289545
 
5.0%
u274991
 
4.8%
s266272
 
4.6%
n264229
 
4.6%
Other values (25)1780012
30.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter4370878
75.7%
Space Separator885878
 
15.4%
Uppercase Letter261049
 
4.5%
Decimal Number169374
 
2.9%
Math Symbol83857
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e502008
11.5%
o476899
10.9%
r401527
 
9.2%
l338479
 
7.7%
h291196
 
6.7%
i289545
 
6.6%
u274991
 
6.3%
s266272
 
6.1%
n264229
 
6.0%
d238372
 
5.5%
Other values (11)1027360
23.5%
Uppercase Letter
ValueCountFrequency (%)
C73820
28.3%
H59869
22.9%
S53839
20.6%
N39811
15.3%
R14878
 
5.7%
P7754
 
3.0%
O7124
 
2.7%
G3743
 
1.4%
I211
 
0.1%
Decimal Number
ValueCountFrequency (%)
184687
50.0%
884687
50.0%
Math Symbol
ValueCountFrequency (%)
<61348
73.2%
+22509
 
26.8%
Space Separator
ValueCountFrequency (%)
885878
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4631927
80.3%
Common1139109
 
19.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e502008
 
10.8%
o476899
 
10.3%
r401527
 
8.7%
l338479
 
7.3%
h291196
 
6.3%
i289545
 
6.3%
u274991
 
5.9%
s266272
 
5.7%
n264229
 
5.7%
d238372
 
5.1%
Other values (20)1288409
27.8%
Common
ValueCountFrequency (%)
885878
77.8%
184687
 
7.4%
884687
 
7.4%
<61348
 
5.4%
+22509
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5771036
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
885878
15.4%
e502008
 
8.7%
o476899
 
8.3%
r401527
 
7.0%
l338479
 
5.9%
h291196
 
5.0%
i289545
 
5.0%
u274991
 
4.8%
s266272
 
4.6%
n264229
 
4.6%
Other values (25)1780012
30.8%

DetailedHholdSumInHhold
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        8
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     16.5 MiB
Householder
84976 
Child under 18 never married
56657 
Spouse of householder
46851 
Child 18 or older
16307 
Other relative of householder
10873 
Other values (3)
8799 

Length

Max length             37
Median length          22
Mean length            20.28046493
Min length             12

Characters and Unicode

Total characters       4552214
Distinct characters    29
Distinct categories    5
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Householder
2nd row Nonrelative of householder
3rd row Householder
4th row Nonrelative of householder
5th row Child 18 or older

Common Values

Value                                   Count    Frequency (%)
Householder                             84976    37.9%
Child under 18 never married            56657    25.2%
Spouse of householder                   46851    20.9%
Child 18 or older                       16307    7.3%
Other relative of householder           10873    4.8%
Nonrelative of householder              8612     3.8%
Group Quarters- Secondary individual    139      0.1%
Child under 18 ever married             48       < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
householder151312
23.5%
child73012
11.3%
1873012
11.3%
of66336
10.3%
under56705
 
8.8%
married56705
 
8.8%
never56657
 
8.8%
spouse46851
 
7.3%
older16307
 
2.5%
or16307
 
2.5%
Other values (8)30962
 
4.8%

Most occurring characters

ValueCountFrequency (%)
644166
14.2%
e642723
14.1%
o457315
10.0%
r441660
9.7%
d354458
7.8%
h301533
 
6.6%
l260255
 
5.7%
u255285
 
5.6%
s198302
 
4.4%
i149619
 
3.3%
Other values (19)846898
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3537144
77.7%
Space Separator644166
 
14.2%
Uppercase Letter224741
 
4.9%
Decimal Number146024
 
3.2%
Dash Punctuation139
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e642723
18.2%
o457315
12.9%
r441660
12.5%
d354458
10.0%
h301533
8.5%
l260255
7.4%
u255285
 
7.2%
s198302
 
5.6%
i149619
 
4.2%
n122252
 
3.5%
Other values (8)353742
10.0%
Uppercase Letter
ValueCountFrequency (%)
H84976
37.8%
C73012
32.5%
S46990
20.9%
O10873
 
4.8%
N8612
 
3.8%
G139
 
0.1%
Q139
 
0.1%
Decimal Number
ValueCountFrequency (%)
173012
50.0%
873012
50.0%
Space Separator
ValueCountFrequency (%)
644166
100.0%
Dash Punctuation
ValueCountFrequency (%)
-139
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3761885
82.6%
Common790329
 
17.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e642723
17.1%
o457315
12.2%
r441660
11.7%
d354458
9.4%
h301533
8.0%
l260255
6.9%
u255285
 
6.8%
s198302
 
5.3%
i149619
 
4.0%
n122252
 
3.2%
Other values (15)578483
15.4%
Common
ValueCountFrequency (%)
644166
81.5%
173012
 
9.2%
873012
 
9.2%
-139
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII4552214
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
644166
14.2%
e642723
14.1%
o457315
10.0%
r441660
9.7%
d354458
7.8%
h301533
 
6.6%
l260255
 
5.7%
u255285
 
5.6%
s198302
 
4.4%
i149619
 
3.3%
Other values (19)846898
18.6%

InstanceWeight
Real number (ℝ≥0)

Distinct        106655
Distinct (%)    47.5%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            1738.935602
Minimum         37.87
Maximum         18656.3
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     1.7 MiB

Quantile statistics

Minimum                      37.87
5-th percentile              394.533
Q1                           1060.68
Median                       1616.7
Q3                           2187.235
95-th percentile             3580.903
Maximum                      18656.3
Range                        18618.43
Interquartile range (IQR)    1126.555

Descriptive statistics

Standard deviation                 992.7424831
Coefficient of variation (CV)      0.5708908839
Kurtosis                           5.680207624
Mean                               1738.935602
Median Absolute Deviation (MAD)    561.22
Skewness                           1.448237458
Sum                                390326702.1
Variance                           985537.6377
Monotonicity                       Not monotonic
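
Several of the figures above are simple functions of one another. The following sketch, which only re-uses numbers printed in this report, shows how the coefficient of variation, variance, interquartile range, and range relate to the mean, standard deviation, quartiles, and extremes. The variable names are illustrative, and nothing here is claimed about how the report generator computes these internally.

```python
# Figures copied from the InstanceWeight summary in this report.
mean = 1738.935602
std = 992.7424831
q1, q3 = 1060.68, 2187.235
minimum, maximum = 37.87, 18656.3

cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range = Q3 - Q1
value_range = maximum - minimum  # range = max - min

print(f"CV={cv:.10f}")
print(f"variance={variance:.4f}")
print(f"IQR={iqr:.3f}")
print(f"range={value_range:.2f}")
```

Small differences in the last printed digit are expected, because the inputs are already rounded in the report.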
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
707.9                    44        < 0.1%
1378.71                  37        < 0.1%
1070.15                  34        < 0.1%
1888.13                  32        < 0.1%
1155.2                   32        < 0.1%
1362.16                  32        < 0.1%
753.23                   32        < 0.1%
1839.19                  31        < 0.1%
1194.59                  30        < 0.1%
1386.38                  30        < 0.1%
Other values (106645)    224129    99.9%

Value       Count    Frequency (%)
37.87       1        < 0.1%
39.11       1        < 0.1%
40.67       2        < 0.1%
42.82       2        < 0.1%
43.26       4        < 0.1%
45.74       1        < 0.1%
47.83       4        < 0.1%
49.82       1        < 0.1%
50.46       1        < 0.1%
52.43       1        < 0.1%

Value       Count    Frequency (%)
18656.3     1        < 0.1%
16349.2     1        < 0.1%
16258.2     1        < 0.1%
13911.5     1        < 0.1%
13388.6     1        < 0.1%
13145.1     2        < 0.1%
13114.2     1        < 0.1%
12960.2     2        < 0.1%
12739.2     1        < 0.1%
12554.3     1        < 0.1%

MigCodeChangeInMsa
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct        9
Distinct (%)    < 0.1%
Missing         112154
Missing (%)     50.0%
Memory size     10.6 MiB
Nonmover
92988 
MSA to MSA
11940 
NonMSA to nonMSA
 
3116
Not in universe
 
1672
MSA to nonMSA
 
867
Other values (4)
 
1726

Length

Max length             17
Median length          9
Mean length            9.669705901
Min length             9

Characters and Unicode

Total characters       1085995
Distinct characters    20
Distinct categories    3
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row MSA to MSA
2nd row Nonmover
3rd row Nonmover
4th row Nonmover
5th row Nonmover

Common Values

Value               Count     Frequency (%)
Nonmover            92988     41.4%
MSA to MSA          11940     5.3%
NonMSA to nonMSA    3116      1.4%
Not in universe     1672      0.7%
MSA to nonMSA       867       0.4%
NonMSA to MSA       680       0.3%
Not identifiable    495       0.2%
Abroad to MSA       467       0.2%
Abroad to nonMSA    84        < 0.1%
(Missing)           112154    50.0%
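
The percentages in this table appear to be taken over all 224463 observations, missing rows included, which is why "Nonmover" reads 41.4% even though it dominates the non-missing rows. The sketch below reproduces that arithmetic from the counts shown in the report; the helper `pct` is a hypothetical name used only for illustration.

```python
n_observations = 224463  # "Number of observations" from the report overview
n_missing = 112154       # "Missing" count for MigCodeChangeInMsa

counts = {"Nonmover": 92988, "MSA to MSA": 11940, "(Missing)": n_missing}


def pct(count, total):
    """Percentage of `total`, rounded to one decimal as in the report."""
    return round(100 * count / total, 1)


# Shares over all rows, as printed in the Common Values table.
shares = {label: pct(c, n_observations) for label, c in counts.items()}

# Share of "Nonmover" among rows where the value is present.
nonmover_valid_share = pct(counts["Nonmover"], n_observations - n_missing)

print(shares)
print(nonmover_valid_share)
```

When comparing categories within a half-missing variable such as this one, the share among non-missing rows is often the more informative figure.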

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
nonmover92988
61.8%
msa25894
 
17.2%
to17154
 
11.4%
nonmsa7863
 
5.2%
not2167
 
1.4%
in1672
 
1.1%
universe1672
 
1.1%
abroad551
 
0.4%
identifiable495
 
0.3%

Most occurring characters

ValueCountFrequency (%)
o213711
19.7%
150456
13.9%
n108757
10.0%
N98951
9.1%
e97322
9.0%
r95211
8.8%
v94660
8.7%
m92988
8.6%
A34308
 
3.2%
M33757
 
3.1%
Other values (10)65874
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter734766
67.7%
Uppercase Letter200773
 
18.5%
Space Separator150456
 
13.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o213711
29.1%
n108757
14.8%
e97322
13.2%
r95211
13.0%
v94660
12.9%
m92988
12.7%
t19816
 
2.7%
i4829
 
0.7%
u1672
 
0.2%
s1672
 
0.2%
Other values (5)4128
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
N98951
49.3%
A34308
 
17.1%
M33757
 
16.8%
S33757
 
16.8%
Space Separator
ValueCountFrequency (%)
150456
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin935539
86.1%
Common150456
 
13.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o213711
22.8%
n108757
11.6%
N98951
10.6%
e97322
10.4%
r95211
10.2%
v94660
10.1%
m92988
9.9%
A34308
 
3.7%
M33757
 
3.6%
S33757
 
3.6%
Other values (9)32117
 
3.4%
Common
ValueCountFrequency (%)
150456
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1085995
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o213711
19.7%
150456
13.9%
n108757
10.0%
N98951
9.1%
e97322
9.0%
r95211
8.8%
v94660
8.7%
m92988
8.6%
A34308
 
3.2%
M33757
 
3.1%
Other values (10)65874
 
6.1%

MigCodeChangeInReg
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct        8
Distinct (%)    < 0.1%
Missing         112154
Missing (%)     50.0%
Memory size     10.6 MiB
Nonmover
92988 
Same county
11011 
Different county same state
 
3136
Not in universe
 
1672
Different region
 
1319
Other values (3)
 
2183

Length

Max length             31
Median length          9
Mean length            10.32235173
Min length             7

Characters and Unicode

Total characters       1159293
Distinct characters    22
Distinct categories    3
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Same county
2nd row Nonmover
3rd row Nonmover
4th row Nonmover
5th row Nonmover

Common Values

Value                             Count     Frequency (%)
Nonmover                          92988     41.4%
Same county                       11011     4.9%
Different county same state       3136      1.4%
Not in universe                   1672      0.7%
Different region                  1319      0.6%
Different state same division     1115      0.5%
Abroad                            553       0.2%
Different division same region    515       0.2%
(Missing)                         112154    50.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
nonmover92988
65.4%
same15777
 
11.1%
county14147
 
9.9%
different6085
 
4.3%
state4251
 
3.0%
region1834
 
1.3%
not1672
 
1.2%
in1672
 
1.2%
universe1672
 
1.2%
division1630
 
1.1%

Most occurring characters

ValueCountFrequency (%)
o205812
17.8%
142281
12.3%
e130364
11.2%
n120028
10.4%
m108765
9.4%
r103132
8.9%
v96290
8.3%
N94660
8.2%
t30406
 
2.6%
a20581
 
1.8%
Other values (12)106974
9.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter904703
78.0%
Space Separator142281
 
12.3%
Uppercase Letter112309
 
9.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o205812
22.7%
e130364
14.4%
n120028
13.3%
m108765
12.0%
r103132
11.4%
v96290
10.6%
t30406
 
3.4%
a20581
 
2.3%
i16153
 
1.8%
u15819
 
1.7%
Other values (7)57353
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
N94660
84.3%
S11011
 
9.8%
D6085
 
5.4%
A553
 
0.5%
Space Separator
ValueCountFrequency (%)
142281
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1017012
87.7%
Common142281
 
12.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o205812
20.2%
e130364
12.8%
n120028
11.8%
m108765
10.7%
r103132
10.1%
v96290
9.5%
N94660
9.3%
t30406
 
3.0%
a20581
 
2.0%
i16153
 
1.6%
Other values (11)90821
8.9%
Common
ValueCountFrequency (%)
142281
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1159293
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o205812
17.8%
142281
12.3%
e130364
11.2%
n120028
10.4%
m108765
9.4%
r103132
8.9%
v96290
8.3%
N94660
8.2%
t30406
 
2.6%
a20581
 
1.8%
Other values (12)106974
9.2%

MigCodeMoveWithinReg
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct        9
Distinct (%)    < 0.1%
Missing         112154
Missing (%)     50.0%
Memory size     10.6 MiB
Nonmover
92988 
Same county
11011 
Different county same state
 
3136
Not in universe
 
1672
Different state in South
 
1092
Other values (4)
 
2410

Length

Max length             29
Median length          9
Mean length            10.36057662
Min length             7

Characters and Unicode

Total characters       1163586
Distinct characters    25
Distinct categories    3
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Same county
2nd row Nonmover
3rd row Nonmover
4th row Nonmover
5th row Nonmover

Common Values

Value                           Count     Frequency (%)
Nonmover                        92988     41.4%
Same county                     11011     4.9%
Different county same state     3136      1.4%
Not in universe                 1672      0.7%
Different state in South        1092      0.5%
Different state in West         760       0.3%
Different state in Midwest      611       0.3%
Abroad                          553       0.2%
Different state in Northeast    486       0.2%
(Missing)                       112154    50.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
nonmover92988
64.2%
same14147
 
9.8%
county14147
 
9.8%
different6085
 
4.2%
state6085
 
4.2%
in4621
 
3.2%
not1672
 
1.2%
universe1672
 
1.2%
south1092
 
0.8%
west760
 
0.5%
Other values (3)1650
 
1.1%

Most occurring characters

ValueCountFrequency (%)
o203926
17.5%
144919
12.5%
e130591
11.2%
n119513
10.3%
m107135
9.2%
r101784
8.7%
N95146
8.2%
v94660
8.1%
t37509
 
3.2%
a21271
 
1.8%
Other values (15)107132
9.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter903409
77.6%
Space Separator144919
 
12.5%
Uppercase Letter115258
 
9.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o203926
22.6%
e130591
14.5%
n119513
13.2%
m107135
11.9%
r101784
11.3%
v94660
10.5%
t37509
 
4.2%
a21271
 
2.4%
u16911
 
1.9%
c14147
 
1.6%
Other values (8)55962
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
N95146
82.6%
S12103
 
10.5%
D6085
 
5.3%
W760
 
0.7%
M611
 
0.5%
A553
 
0.5%
Space Separator
ValueCountFrequency (%)
144919
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1018667
87.5%
Common144919
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o203926
20.0%
e130591
12.8%
n119513
11.7%
m107135
10.5%
r101784
10.0%
N95146
9.3%
v94660
9.3%
t37509
 
3.7%
a21271
 
2.1%
u16911
 
1.7%
Other values (14)90221
8.9%
Common
ValueCountFrequency (%)
144919
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1163586
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o203926
17.5%
144919
12.5%
e130591
11.2%
n119513
10.3%
m107135
9.2%
r101784
8.7%
N95146
8.2%
v94660
8.1%
t37509
 
3.2%
a21271
 
1.8%
Other values (15)107132
9.2%

LiveInThisHouse1YearAgo
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        3
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     16.2 MiB
Not in universe under 1 year old
113826 
Yes
92988 
No
17649 

Length

Max length             33
Median length          33
Mean length            18.62737734
Min length             3

Characters and Unicode

Total characters       4181157
Distinct characters    17
Distinct categories    4
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Not in universe under 1 year old
2nd row No
3rd row Not in universe under 1 year old
4th row Not in universe under 1 year old
5th row Yes

Common Values

Value                               Count     Frequency (%)
Not in universe under 1 year old    113826    50.7%
Yes                                 92988     41.4%
No                                  17649     7.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not113826
12.5%
in113826
12.5%
universe113826
12.5%
under113826
12.5%
1113826
12.5%
year113826
12.5%
old113826
12.5%
yes92988
10.2%
no17649
 
1.9%

Most occurring characters

ValueCountFrequency (%)
907419
21.7%
e548292
13.1%
n341478
 
8.2%
r341478
 
8.2%
o245301
 
5.9%
i227652
 
5.4%
u227652
 
5.4%
d227652
 
5.4%
s206814
 
4.9%
N131475
 
3.1%
Other values (7)775944
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2935449
70.2%
Space Separator907419
 
21.7%
Uppercase Letter224463
 
5.4%
Decimal Number113826
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e548292
18.7%
n341478
11.6%
r341478
11.6%
o245301
8.4%
i227652
7.8%
u227652
7.8%
d227652
7.8%
s206814
 
7.0%
t113826
 
3.9%
v113826
 
3.9%
Other values (3)341478
11.6%
Uppercase Letter
ValueCountFrequency (%)
N131475
58.6%
Y92988
41.4%
Space Separator
ValueCountFrequency (%)
907419
100.0%
Decimal Number
ValueCountFrequency (%)
1113826
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3159912
75.6%
Common1021245
 
24.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e548292
17.4%
n341478
10.8%
r341478
10.8%
o245301
7.8%
i227652
7.2%
u227652
7.2%
d227652
7.2%
s206814
 
6.5%
N131475
 
4.2%
t113826
 
3.6%
Other values (5)548292
17.4%
Common
ValueCountFrequency (%)
907419
88.9%
1113826
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII4181157
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
907419
21.7%
e548292
13.1%
n341478
 
8.2%
r341478
 
8.2%
o245301
 
5.9%
i227652
 
5.4%
u227652
 
5.4%
d227652
 
5.4%
s206814
 
4.9%
N131475
 
3.1%
Other values (7)775944
18.6%

MigPrevResInSunbelt
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct        3
Distinct (%)    < 0.1%
Missing         112154
Missing (%)     50.0%
Memory size     11.0 MiB
Not in universe
94660 
No
11128 
Yes
 
6521

Length

Max length             16
Median length          16
Mean length            14.01515462
Min length             3

Characters and Unicode

Total characters       1574028
Distinct characters    12
Distinct categories    3
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Yes
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value              Count     Frequency (%)
Not in universe    94660     42.2%
No                 11128     5.0%
Yes                6521      2.9%
(Missing)          112154    50.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not94660
31.4%
in94660
31.4%
universe94660
31.4%
no11128
 
3.7%
yes6521
 
2.2%

Most occurring characters

ValueCountFrequency (%)
301629
19.2%
e195841
12.4%
i189320
12.0%
n189320
12.0%
N105788
 
6.7%
o105788
 
6.7%
s101181
 
6.4%
t94660
 
6.0%
u94660
 
6.0%
v94660
 
6.0%
Other values (2)101181
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1160090
73.7%
Space Separator301629
 
19.2%
Uppercase Letter112309
 
7.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e195841
16.9%
i189320
16.3%
n189320
16.3%
o105788
9.1%
s101181
8.7%
t94660
8.2%
u94660
8.2%
v94660
8.2%
r94660
8.2%
Uppercase Letter
ValueCountFrequency (%)
N105788
94.2%
Y6521
 
5.8%
Space Separator
ValueCountFrequency (%)
301629
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1272399
80.8%
Common301629
 
19.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e195841
15.4%
i189320
14.9%
n189320
14.9%
N105788
8.3%
o105788
8.3%
s101181
8.0%
t94660
7.4%
u94660
7.4%
v94660
7.4%
r94660
7.4%
Common
ValueCountFrequency (%)
301629
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1574028
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
301629
19.2%
e195841
12.4%
i189320
12.0%
n189320
12.0%
N105788
 
6.7%
o105788
 
6.7%
s101181
 
6.4%
t94660
 
6.0%
u94660
 
6.0%
v94660
 
6.0%
Other values (2)101181
 
6.4%

NumOfPersonsWorkForEmployer
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct        7
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            1.956059573
Minimum         0
Maximum         6
Zeros           107852
Zeros (%)       48.0%
Negative        0
Negative (%)    0.0%
Memory size     1.7 MiB

Quantile statistics

Minimum                      0
5-th percentile              0
Q1                           0
Median                       1
Q3                           4
95-th percentile             6
Maximum                      6
Range                        6
Interquartile range (IQR)    4

Descriptive statistics

Standard deviation                 2.364070318
Coefficient of variation (CV)      1.208588097
Kurtosis                           -1.080018797
Mean                               1.956059573
Median Absolute Deviation (MAD)    1
Skewness                           0.7521851699
Sum                                439063
Variance                           5.588828468
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=7)
Value    Count     Frequency (%)
0        107852    48.0%
6        41072     18.3%
1        26115     11.6%
4        16146     7.2%
3        15237     6.8%
2        11328     5.0%
5        6713      3.0%

Value    Count     Frequency (%)
0        107852    48.0%
1        26115     11.6%
2        11328     5.0%
3        15237     6.8%
4        16146     7.2%
5        6713      3.0%
6        41072     18.3%

Value    Count     Frequency (%)
6        41072     18.3%
5        6713      3.0%
4        16146     7.2%
3        15237     6.8%
2        11328     5.0%
1        26115     11.6%
0        107852    48.0%
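
Because this variable takes only seven distinct values, its summary statistics can be recovered directly from the frequency table. The sketch below does that with plain Python. The use of the sample variance (dividing by n - 1) is an assumption about the report's convention, and `value_at_rank` is a hypothetical helper written for this illustration.

```python
# Value -> count, copied from the NumOfPersonsWorkForEmployer frequency table.
freq = {0: 107852, 1: 26115, 2: 11328, 3: 15237, 4: 16146, 5: 6713, 6: 41072}

n = sum(freq.values())                       # number of observations
total = sum(v * c for v, c in freq.items())  # the "Sum" statistic
mean = total / n

# Sample variance (ddof=1); assumed convention, see the note above.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)

zeros_pct = 100 * freq[0] / n


def value_at_rank(rank):
    """Return the value held by the rank-th smallest observation (1-based)."""
    running = 0
    for value in sorted(freq):
        running += freq[value]
        if running >= rank:
            return value
    raise ValueError("rank exceeds the number of observations")


# n is odd here, so the median is the middle observation.
median = value_at_rank((n + 1) // 2)

print(n, total, round(mean, 9), round(variance, 6), median, round(zeros_pct, 1))
```

Nearly half of the rows are zero, which is what pulls the median down to 1 while the mean sits close to 2.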

FamilyMembersUnder18
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        5
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     15.9 MiB
Not in universe
162391 
Both parents present
43808 
Mother only present
 
14287
Father only present
 
2124
Neither parent present
 
1853

Length

Max length             23
Median length          16
Mean length            17.32607601
Min length             16

Characters and Unicode

Total characters       3889063
Distinct characters    19
Distinct categories    3
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value                     Count     Frequency (%)
Not in universe           162391    72.3%
Both parents present      43808     19.5%
Mother only present       14287     6.4%
Father only present       2124      0.9%
Neither parent present    1853      0.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not162391
24.1%
in162391
24.1%
universe162391
24.1%
present62072
 
9.2%
both43808
 
6.5%
parents43808
 
6.5%
only16411
 
2.4%
mother14287
 
2.1%
father2124
 
0.3%
neither1853
 
0.3%

Most occurring characters

ValueCountFrequency (%)
673389
17.3%
e514704
13.2%
n448926
11.5%
t332196
8.5%
i326635
8.4%
r288388
7.4%
s268271
 
6.9%
o236897
 
6.1%
N164244
 
4.2%
u162391
 
4.2%
Other values (9)473022
12.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2991211
76.9%
Space Separator673389
 
17.3%
Uppercase Letter224463
 
5.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e514704
17.2%
n448926
15.0%
t332196
11.1%
i326635
10.9%
r288388
9.6%
s268271
9.0%
o236897
7.9%
u162391
 
5.4%
v162391
 
5.4%
p107733
 
3.6%
Other values (4)142679
 
4.8%
Uppercase Letter
ValueCountFrequency (%)
N164244
73.2%
B43808
 
19.5%
M14287
 
6.4%
F2124
 
0.9%
Space Separator
ValueCountFrequency (%)
673389
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3215674
82.7%
Common673389
 
17.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e514704
16.0%
n448926
14.0%
t332196
10.3%
i326635
10.2%
r288388
9.0%
s268271
8.3%
o236897
7.4%
N164244
 
5.1%
u162391
 
5.0%
v162391
 
5.0%
Other values (8)310631
9.7%
Common
ValueCountFrequency (%)
673389
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3889063
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
673389
17.3%
e514704
13.2%
n448926
11.5%
t332196
8.5%
i326635
8.4%
r288388
7.4%
s268271
 
6.9%
o236897
 
6.1%
N164244
 
4.2%
u162391
 
4.2%
Other values (9)473022
12.2%

CntryOfBirthFather
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct        42
Distinct (%)    < 0.1%
Missing         7498
Missing (%)     3.3%
Memory size     14.7 MiB
United-States
178991 
Mexico
 
11307
Puerto-Rico
 
2944
Italy
 
2470
Germany
 
1541
Other values (37)
19712 

Length

Max length             29
Median length          14
Mean length            13.03709354
Min length             5

Characters and Unicode

Total characters       2828593
Distinct characters    46
Distinct categories    7
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row United-States
2nd row United-States
3rd row United-States
4th row United-States
5th row United-States

Common Values

Value                 Count     Frequency (%)
United-States         178991    79.7%
Mexico                11307     5.0%
Puerto-Rico           2944      1.3%
Italy                 2470      1.1%
Germany               1541      0.7%
Canada                1525      0.7%
Dominican-Republic    1495      0.7%
Poland                1377      0.6%
Cuba                  1301      0.6%
Philippines           1284      0.6%
Other values (32)     12730     5.7%
(Missing)             7498      3.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
united-states178991
82.0%
mexico11307
 
5.2%
puerto-rico2944
 
1.3%
italy2470
 
1.1%
germany1541
 
0.7%
canada1525
 
0.7%
dominican-republic1495
 
0.7%
poland1377
 
0.6%
cuba1301
 
0.6%
philippines1284
 
0.6%
Other values (38)14142
 
6.5%

Most occurring characters

ValueCountFrequency (%)
t545472
19.3%
e380859
13.5%
218377
7.7%
a209117
 
7.4%
i207148
 
7.3%
n195080
 
6.9%
d186795
 
6.6%
-184766
 
6.5%
S181300
 
6.4%
s180999
 
6.4%
Other values (36)338680
12.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2021321
71.5%
Uppercase Letter403636
 
14.3%
Space Separator218377
 
7.7%
Dash Punctuation184766
 
6.5%
Open Punctuation178
 
< 0.1%
Close Punctuation178
 
< 0.1%
Other Punctuation137
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t545472
27.0%
e380859
18.8%
a209117
 
10.3%
i207148
 
10.2%
n195080
 
9.7%
d186795
 
9.2%
s180999
 
9.0%
o25586
 
1.3%
c19628
 
1.0%
l12896
 
0.6%
Other values (11)57741
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
S181300
44.9%
U179347
44.4%
M11307
 
2.8%
P6457
 
1.6%
C4682
 
1.2%
R4439
 
1.1%
I4178
 
1.0%
G2624
 
0.7%
E2428
 
0.6%
D1495
 
0.4%
Other values (10)5379
 
1.3%
Space Separator
ValueCountFrequency (%)
218377
100.0%
Dash Punctuation
ValueCountFrequency (%)
-184766
100.0%
Open Punctuation
ValueCountFrequency (%)
(178
100.0%
Close Punctuation
ValueCountFrequency (%)
)178
100.0%
Other Punctuation
ValueCountFrequency (%)
&137
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2424957
85.7%
Common403636
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t545472
22.5%
e380859
15.7%
a209117
 
8.6%
i207148
 
8.5%
n195080
 
8.0%
d186795
 
7.7%
S181300
 
7.5%
s180999
 
7.5%
U179347
 
7.4%
o25586
 
1.1%
Other values (31)133254
 
5.5%
Common
ValueCountFrequency (%)
218377
54.1%
-184766
45.8%
(178
 
< 0.1%
)178
 
< 0.1%
&137
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII2828593
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t545472
19.3%
e380859
13.5%
218377
7.7%
a209117
 
7.4%
i207148
 
7.3%
n195080
 
6.9%
d186795
 
6.6%
-184766
 
6.5%
S181300
 
6.4%
s180999
 
6.4%
Other values (36)338680
12.0%

CntryOfBirthMother
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct        42
Distinct (%)    < 0.1%
Missing         6843
Missing (%)     3.0%
Memory size     14.7 MiB
United-States
180431 
Mexico
 
11095
Puerto-Rico
 
2733
Italy
 
2050
Canada
 
1588
Other values (37)
19723 

Length

Max length             29
Median length          14
Mean length            13.05336826
Min length             5

Characters and Unicode

Total characters       2840674
Distinct characters    46
Distinct categories    7
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row United-States
2nd row United-States
3rd row United-States
4th row United-States
5th row United-States

Common Values

Value                 Count     Frequency (%)
United-States         180431    80.4%
Mexico                11095     4.9%
Puerto-Rico           2733      1.2%
Italy                 2050      0.9%
Canada                1588      0.7%
Germany               1566      0.7%
Philippines           1380      0.6%
Cuba                  1302      0.6%
Poland                1240      0.6%
Dominican-Republic    1235      0.6%
Other values (32)     13000     5.8%
(Missing)             6843      3.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
united-states180431
82.4%
mexico11095
 
5.1%
puerto-rico2733
 
1.2%
italy2050
 
0.9%
canada1588
 
0.7%
germany1566
 
0.7%
philippines1380
 
0.6%
cuba1302
 
0.6%
poland1240
 
0.6%
dominican-republic1235
 
0.6%
Other values (38)14467
 
6.6%

Most occurring characters

ValueCountFrequency (%)
t549206
19.3%
e383099
13.5%
219087
 
7.7%
a210476
 
7.4%
i207577
 
7.3%
n196445
 
6.9%
d188477
 
6.6%
-185842
 
6.5%
S182933
 
6.4%
s182509
 
6.4%
Other values (36)335023
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2029886
71.5%
Uppercase Letter405394
 
14.3%
Space Separator219087
 
7.7%
Dash Punctuation185842
 
6.5%
Open Punctuation170
 
< 0.1%
Close Punctuation170
 
< 0.1%
Other Punctuation125
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t549206
27.1%
e383099
18.9%
a210476
 
10.4%
i207577
 
10.2%
n196445
 
9.7%
d188477
 
9.3%
s182509
 
9.0%
o24715
 
1.2%
c18591
 
0.9%
l12520
 
0.6%
Other values (11)56271
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
S182933
45.1%
U180771
44.6%
M11095
 
2.7%
P6172
 
1.5%
C4614
 
1.1%
R3968
 
1.0%
I3809
 
0.9%
E2652
 
0.7%
G2541
 
0.6%
D1235
 
0.3%
Other values (10)5604
 
1.4%
Space Separator
ValueCountFrequency (%)
219087
100.0%
Dash Punctuation
ValueCountFrequency (%)
-185842
100.0%
Other Punctuation
ValueCountFrequency (%)
&125
100.0%
Open Punctuation
ValueCountFrequency (%)
(170
100.0%
Close Punctuation
ValueCountFrequency (%)
)170
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2435280
85.7%
Common405394
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t549206
22.6%
e383099
15.7%
a210476
 
8.6%
i207577
 
8.5%
n196445
 
8.1%
d188477
 
7.7%
S182933
 
7.5%
s182509
 
7.5%
U180771
 
7.4%
o24715
 
1.0%
Other values (31)129072
 
5.3%
Common
ValueCountFrequency (%)
219087
54.0%
-185842
45.8%
(170
 
< 0.1%
)170
 
< 0.1%
&125
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII2840674
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t549206
19.3%
e383099
13.5%
219087
 
7.7%
a210476
 
7.4%
i207577
 
7.3%
n196445
 
6.9%
d188477
 
6.6%
-185842
 
6.5%
S182933
 
6.4%
s182509
 
6.4%
Other values (36)335023
11.8%

CntryOfBirthSelf
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct42
Distinct (%)< 0.1%
Missing3869
Missing (%)1.7%
Memory size14.9 MiB
United-States
198959 
Mexico
 
6583
Puerto-Rico
 
1547
Cuba
 
971
Germany
 
962
Other values (37)
 
11572

Length

Max length29
Median length14
Mean length13.47078796
Min length5

Characters and Unicode

Total characters2971575
Distinct characters46
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
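
As a minimal sketch of how per-category character counts like the ones in this report could be derived, Python's standard `unicodedata` module returns the general category of each character. This is not the profiling tool's own code; the helper name `unicode_category_counts` and the three sample values are illustrative assumptions.

```python
import unicodedata
from collections import Counter


def unicode_category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lu', 'Ll', 'Pd')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Illustrative sample, not the full column.
sample = ["United-States", "Puerto-Rico", "Mexico"]
counts = unicode_category_counts(sample)
for code, count in counts.most_common():
    print(code, count)
```

The codes `Lu`, `Ll`, and `Pd` correspond to the Uppercase Letter, Lowercase Letter, and Dash Punctuation rows of the report.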

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row United-States
2nd row United-States
3rd row United-States
4th row United-States
5th row United-States

Common Values

Value                 Count     Frequency (%)
United-States         198959    88.6%
Mexico                6583      2.9%
Puerto-Rico           1547      0.7%
Cuba                  971       0.4%
Germany               962       0.4%
Philippines           950       0.4%
Dominican-Republic    776       0.3%
El-Salvador           766       0.3%
Canada                742       0.3%
China                 562       0.3%
Other values (32)     7776      3.5%
(Missing)             3869      1.7%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
united-states198959
89.7%
mexico6583
 
3.0%
puerto-rico1547
 
0.7%
cuba971
 
0.4%
germany962
 
0.4%
philippines950
 
0.4%
dominican-republic776
 
0.3%
el-salvador766
 
0.3%
canada742
 
0.3%
china562
 
0.3%
Other values (38)8918
 
4.0%

Most occurring characters

ValueCountFrequency (%)
t601053
20.2%
e411348
13.8%
221736
 
7.5%
a216415
 
7.3%
i216083
 
7.3%
n208146
 
7.0%
d203011
 
6.8%
-202199
 
6.8%
S200580
 
6.7%
s200317
 
6.7%
Other values (36)290687
9.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2123021
71.4%
Uppercase Letter424277
 
14.3%
Space Separator221736
 
7.5%
Dash Punctuation202199
 
6.8%
Open Punctuation125
 
< 0.1%
Close Punctuation125
 
< 0.1%
Other Punctuation92
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t601053
28.3%
e411348
19.4%
a216415
 
10.2%
i216083
 
10.2%
n208146
 
9.8%
d203011
 
9.6%
s200317
 
9.4%
o14665
 
0.7%
c11123
 
0.5%
x6583
 
0.3%
Other values (11)34277
 
1.6%
Uppercase Letter
ValueCountFrequency (%)
S200580
47.3%
U199209
47.0%
M6583
 
1.6%
P3429
 
0.8%
C2865
 
0.7%
R2323
 
0.5%
G1635
 
0.4%
E1578
 
0.4%
I1396
 
0.3%
D776
 
0.2%
Other values (10)3903
 
0.9%
Space Separator
ValueCountFrequency (%)
221736
100.0%
Dash Punctuation
ValueCountFrequency (%)
-202199
100.0%
Other Punctuation
ValueCountFrequency (%)
&92
100.0%
Open Punctuation
ValueCountFrequency (%)
(125
100.0%
Close Punctuation
ValueCountFrequency (%)
)125
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2547298
85.7%
Common424277
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t601053
23.6%
e411348
16.1%
a216415
 
8.5%
i216083
 
8.5%
n208146
 
8.2%
d203011
 
8.0%
S200580
 
7.9%
s200317
 
7.9%
U199209
 
7.8%
o14665
 
0.6%
Other values (31)76471
 
3.0%
Common
ValueCountFrequency (%)
221736
52.3%
-202199
47.7%
(125
 
< 0.1%
)125
 
< 0.1%
&92
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII2971575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t601053
20.2%
e411348
13.8%
221736
 
7.5%
a216415
 
7.3%
i216083
 
7.3%
n208146
 
7.0%
d203011
 
6.8%
-202199
 
6.8%
S200580
 
6.7%
s200317
 
6.7%
Other values (36)290687
9.8%

Citizenship
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size19.6 MiB
Native- Born in the United States
198961 
Foreign born- Not a citizen of U S
 
15084
Foreign born- U S citizen by naturalization
 
6704
Native- Born abroad of American Parent(s)
 
2042
Native- Born in Puerto Rico or U S Outlying
 
1672

Length

Max length44
Median length34
Mean length34.58033618
Min length34

Characters and Unicode

Total characters7762006
Distinct characters33
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Native- Born in the United States
2nd row Native- Born in the United States
3rd row Native- Born in the United States
4th row Native- Born in the United States
5th row Native- Born in the United States

Common Values

Value                                          Count     Frequency (%)
Native- Born in the United States              198961    88.6%
Foreign born- Not a citizen of U S             15084     6.7%
Foreign born- U S citizen by naturalization    6704      3.0%
Native- Born abroad of American Parent(s)      2042      0.9%
Native- Born in Puerto Rico or U S Outlying    1672      0.7%

Length

[Figure: histogram of lengths of the category]

[Figure: pie chart]
ValueCountFrequency (%)
born224463
16.2%
native202675
14.6%
in200633
14.4%
the198961
14.3%
united198961
14.3%
states198961
14.3%
u23460
 
1.7%
s23460
 
1.7%
foreign21788
 
1.6%
citizen21788
 
1.6%
Other values (12)73516
 
5.3%

Most occurring characters

ValueCountFrequency (%)
1403750
18.1%
t1054185
13.6%
e848890
10.9%
n686797
8.8%
i686427
8.8%
a445000
 
5.7%
o292223
 
3.8%
r262425
 
3.4%
-224463
 
2.9%
U222421
 
2.9%
Other values (23)1635425
21.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5233545
67.4%
Space Separator1403750
 
18.1%
Uppercase Letter896164
 
11.5%
Dash Punctuation224463
 
2.9%
Open Punctuation2042
 
< 0.1%
Close Punctuation2042
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1054185
20.1%
e848890
16.2%
n686797
13.1%
i686427
13.1%
a445000
8.5%
o292223
 
5.6%
r262425
 
5.0%
v202675
 
3.9%
d201003
 
3.8%
s201003
 
3.8%
Other values (10)352917
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
U222421
24.8%
S222421
24.8%
N217759
24.3%
B202675
22.6%
F21788
 
2.4%
P3714
 
0.4%
A2042
 
0.2%
R1672
 
0.2%
O1672
 
0.2%
Space Separator
ValueCountFrequency (%)
1403750
100.0%
Dash Punctuation
ValueCountFrequency (%)
-224463
100.0%
Open Punctuation
ValueCountFrequency (%)
(2042
100.0%
Close Punctuation
ValueCountFrequency (%)
)2042
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin6129709
79.0%
Common1632297
 
21.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1054185
17.2%
e848890
13.8%
n686797
11.2%
i686427
11.2%
a445000
 
7.3%
o292223
 
4.8%
r262425
 
4.3%
U222421
 
3.6%
S222421
 
3.6%
N217759
 
3.6%
Other values (19)1191161
19.4%
Common
ValueCountFrequency (%)
1403750
86.0%
-224463
 
13.8%
(2042
 
0.1%
)2042
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII7762006
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1403750
18.1%
t1054185
13.6%
e848890
10.9%
n686797
8.8%
i686427
8.8%
a445000
 
5.7%
o292223
 
3.8%
r262425
 
3.4%
-224463
 
2.9%
U222421
 
2.9%
Other values (23)1635425
21.1%

OwnBusinessOrSelfEmployed
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size12.4 MiB
0
203090 
2
 
18345
1
 
3028

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters224463
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row2
3rd row0
4th row0
5th row0

Common Values

Value    Count     Frequency (%)
0        203090    90.5%
2        18345     8.2%
1        3028      1.3%

Length

[Figure: histogram of lengths of the category]

[Figure: pie chart]
ValueCountFrequency (%)
0203090
90.5%
218345
 
8.2%
13028
 
1.3%

Most occurring characters

ValueCountFrequency (%)
0203090
90.5%
218345
 
8.2%
13028
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number224463
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0203090
90.5%
218345
 
8.2%
13028
 
1.3%

Most occurring scripts

ValueCountFrequency (%)
Common224463
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0203090
90.5%
218345
 
8.2%
13028
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII224463
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0203090
90.5%
218345
 
8.2%
13028
 
1.3%

FillIncVeteransAdmin
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size15.6 MiB
Not in universe
222206 
No
 
1813
Yes
 
444

Length

Max length16
Median length16
Mean length15.87126163
Min length3

Characters and Unicode

Total characters3562511
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value              Count     Frequency (%)
Not in universe    222206    99.0%
No                 1813      0.8%
Yes                444       0.2%

Length

[Figure: histogram of lengths of the category]

[Figure: pie chart]
ValueCountFrequency (%)
not222206
33.2%
in222206
33.2%
universe222206
33.2%
no1813
 
0.3%
yes444
 
0.1%

Most occurring characters

ValueCountFrequency (%)
668875
18.8%
e444856
12.5%
i444412
12.5%
n444412
12.5%
N224019
 
6.3%
o224019
 
6.3%
s222650
 
6.2%
t222206
 
6.2%
u222206
 
6.2%
v222206
 
6.2%
Other values (2)222650
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2669173
74.9%
Space Separator668875
 
18.8%
Uppercase Letter224463
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e444856
16.7%
i444412
16.6%
n444412
16.6%
o224019
8.4%
s222650
8.3%
t222206
8.3%
u222206
8.3%
v222206
8.3%
r222206
8.3%
Uppercase Letter
ValueCountFrequency (%)
N224019
99.8%
Y444
 
0.2%
Space Separator
ValueCountFrequency (%)
668875
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2893636
81.2%
Common668875
 
18.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e444856
15.4%
i444412
15.4%
n444412
15.4%
N224019
7.7%
o224019
7.7%
s222650
7.7%
t222206
7.7%
u222206
7.7%
v222206
7.7%
r222206
7.7%
Common
ValueCountFrequency (%)
668875
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3562511
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
668875
18.8%
e444856
12.5%
i444412
12.5%
n444412
12.5%
N224019
 
6.3%
o224019
 
6.3%
s222650
 
6.2%
t222206
 
6.2%
u222206
 
6.2%
v222206
 
6.2%
Other values (2)222650
 
6.2%

VeteransBenefits
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size12.4 MiB
2
168917 
0
53289 
1
 
2257

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters224463
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2
2nd row2
3rd row2
4th row2
5th row2

Common Values

Value    Count     Frequency (%)
2        168917    75.3%
0        53289     23.7%
1        2257      1.0%

Length

[Figure: histogram of lengths of the category]

[Figure: pie chart]
ValueCountFrequency (%)
2168917
75.3%
053289
 
23.7%
12257
 
1.0%

Most occurring characters

ValueCountFrequency (%)
2168917
75.3%
053289
 
23.7%
12257
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number224463
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2168917
75.3%
053289
 
23.7%
12257
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Common224463
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2168917
75.3%
053289
 
23.7%
12257
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII224463
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2168917
75.3%
053289
 
23.7%
12257
 
1.0%

WeeksWorkedInYear
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.18621332
Minimum: 0
Maximum: 52
Zeros: 107852
Zeros (%): 48.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 8
Q3: 52
95-th percentile: 52
Maximum: 52
Range: 52
Interquartile range (IQR): 52

Descriptive statistics

Standard deviation: 24.39983132
Coefficient of variation (CV): 1.052342225
Kurtosis: -1.863321616
Mean: 23.18621332
Median Absolute Deviation (MAD): 8
Skewness: 0.2091461191
Sum: 5204447
Variance: 595.3517685
Monotonicity: Not monotonic
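
Several of the WeeksWorkedInYear figures are related by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the range and IQR are differences of quantiles. The sketch below only re-derives those values from the numbers printed in this report; the variable names are illustrative, and the tolerances assume the reported figures are rounded.

```python
# Figures copied from the WeeksWorkedInYear summary in this report.
n_rows = 224463          # number of observations (no missing values)
total = 5204447          # Sum
mean = 23.18621332
std = 24.39983132
variance = 595.3517685
cv = 1.052342225
q1, q3 = 0, 52
minimum, maximum = 0, 52

# Re-derive the dependent statistics from the independent ones.
derived = {
    "mean": total / n_rows,
    "variance": std ** 2,
    "cv": std / mean,
    "iqr": q3 - q1,
    "range": maximum - minimum,
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

With 48.0% of rows at 0 and 35.2% at 52, both quartiles sit on the extremes, which is why the IQR equals the full range.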
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values

Value                Count     Frequency (%)
0                    107852    48.0%
52                   78986     35.2%
40                   3132      1.4%
50                   2529      1.1%
26                   2524      1.1%
48                   2130      0.9%
12                   2060      0.9%
30                   1570      0.7%
20                   1533      0.7%
36                   1264      0.6%
Other values (43)    20883     9.3%

Smallest values

Value    Count     Frequency (%)
0        107852    48.0%
1        509       0.2%
2        504       0.2%
3        476       0.2%
4        841       0.4%
5        312       0.1%
6        724       0.3%
7        162       0.1%
8        1264      0.6%
9        283       0.1%

Largest values

Value    Count    Frequency (%)
52       78986    35.2%
51       926      0.4%
50       2529     1.1%
49       603      0.3%
48       2130     0.9%
47       320      0.1%
46       756      0.3%
45       777      0.3%
44       971      0.4%
43       436      0.2%

Year
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size12.6 MiB
94
112309 
95
112154 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters448926
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row95
2nd row94
3rd row95
4th row95
5th row94

Common Values

Value    Count     Frequency (%)
94       112309    50.0%
95       112154    50.0%

Length

[Figure: histogram of lengths of the category]

[Figure: pie chart]
ValueCountFrequency (%)
94112309
50.0%
95112154
50.0%

Most occurring characters

ValueCountFrequency (%)
9224463
50.0%
4112309
25.0%
5112154
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number448926
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9224463
50.0%
4112309
25.0%
5112154
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common448926
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9224463
50.0%
4112309
25.0%
5112154
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII448926
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9224463
50.0%
4112309
25.0%
5112154
25.0%

Target
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size12.4 MiB
0
210466 
1
 
13997

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters224463
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value    Count     Frequency (%)
0        210466    93.8%
1        13997     6.2%

Length

[Figure: histogram of lengths of the category]

[Figure: pie chart]
ValueCountFrequency (%)
0210466
93.8%
113997
 
6.2%

Most occurring characters

ValueCountFrequency (%)
0210466
93.8%
113997
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number224463
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0210466
93.8%
113997
 
6.2%

Most occurring scripts

ValueCountFrequency (%)
Common224463
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0210466
93.8%
113997
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII224463
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0210466
93.8%
113997
 
6.2%

Interactions

[Figure: 121 pairwise interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for an exactly linear relationship the (nonzero) slope of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
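
The calculation described above can be sketched in plain Python. This is an illustration of the formula only, not the code the report generator uses; the function name `pearson_r` and the sample data are assumptions. The normalising factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors in the covariance and both deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4]
y_shifted = [3 * v + 10 for v in x]   # change of location and scale
y_negated = [-v for v in x]

print(pearson_r(x, y_shifted))
print(pearson_r(x, y_negated))
```

The first call illustrates the invariance under changes of location and scale; the second shows a perfectly negative linear relationship.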

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
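
A minimal sketch of that calculation, assuming ties receive the average of the ranks they span (a common convention; the report does not state which tie rule is used). The helper names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y_cubed = [v ** 3 for v in x]   # nonlinear but strictly increasing
print(spearman_rho(x, y_cubed))
```

Because the cubic relationship is strictly increasing, the ranks of both variables coincide even though the relationship is not linear.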

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
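
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition, counting tied pairs as neither concordant nor discordant; the report's own figures may come from a tie-adjusted variant, so treat this as an illustration. The function name is an assumption.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]                 # one swapped pair out of six
tau = kendall_tau_a(x, y)
print(tau)
```

Of the six pairs, five are concordant and one is discordant, so τ = (5 - 1) / 6.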

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the phik package documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
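
A minimal sketch of the bias-corrected estimator, written from the commonly cited form of Bergsma's correction and not taken from the report generator's source. It assumes a contingency table given as a list of rows with no empty row or column; the function name is illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table of counts."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


perfect = [[50, 0], [0, 50]]        # each row value determines the column value
independent = [[25, 25], [25, 25]]  # no association
print(cramers_v_corrected(perfect), cramers_v_corrected(independent))
```

The two toy tables hit the ends of the 0 to 1 range described above.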

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
[Figure: heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
[Figure: dendrogram] The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
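
Nullity correlation between two columns can be sketched as the correlation of their missingness indicators, which for two binary indicators reduces to the phi coefficient of a 2 x 2 table. The code below is an illustration under that assumption, with `None` standing in for a missing value; the function name and the toy columns are hypothetical.

```python
import math


def nullity_correlation(col_a, col_b):
    """Phi coefficient between the missingness indicators of two columns."""
    a_missing = [v is None for v in col_a]
    b_missing = [v is None for v in col_b]
    n11 = sum(a and b for a, b in zip(a_missing, b_missing))          # both missing
    n10 = sum(a and not b for a, b in zip(a_missing, b_missing))      # only a missing
    n01 = sum(b and not a for a, b in zip(a_missing, b_missing))      # only b missing
    n00 = sum(not a and not b for a, b in zip(a_missing, b_missing))  # both present
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return None  # a column is entirely present or entirely missing
    return (n11 * n00 - n10 * n01) / denom


together = nullity_correlation([1, None, 3, None], [5, None, 7, None])
opposite = nullity_correlation([1, None, 3, None], [None, 2, None, 4])
print(together, opposite)
```

A value of 1 means the two columns are always missing together, and -1 means one is present exactly when the other is missing.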

Sample

First rows

| | ID | Age | ClassOfWorker | IndustryCode | OccupationCode | Education | WagePerHour | EnrollInEdUInstlastWk | MaritalStatus | MajorIndustryCode | MajorOccupationCode | Race | HispanicOrigin | Sex | MemberOfALaborUnion | ReasonForUnemployment | FullOrPartTimeEmploymentStat | CapitalGains | CapitalLosses | DividendsFromStocks | TaxFilerStat | RegionOfPreviousResidence | StateOfPreviousResidence | DetailedHholdAndFamStat | DetailedHholdSumInHhold | InstanceWeight | MigCodeChangeInMsa | MigCodeChangeInReg | MigCodeMoveWithinReg | LiveInThisHouse1YearAgo | MigPrevResInSunbelt | NumOfPersonsWorkForEmployer | FamilyMembersUnder18 | CntryOfBirthFather | CntryOfBirthMother | CntryOfBirthSelf | Citizenship | OwnBusinessOrSelfEmployed | FillIncVeteransAdmin | VeteransBenefits | WeeksWorkedInYear | Year | Target |
| 0 | 40327 | 42 | Not in universe | 0 | 0 | 10th grade | 0 | Not in universe | Married-civilian spouse present | Not in universe or children | Not in universe | White | All other | Male | Not in universe | Not in universe | Not in labor force | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Householder | Householder | 1005.05 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 0 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 0 | 95 | 0 |
| 1 | 200913 | 26 | Private | 19 | 39 | 11th grade | 0 | Not in universe | Never married | Manufacturing-nondurable goods | Transportation and material moving | White | All other | Male | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Single | South | Arkansas | Secondary individual | Nonrelative of householder | 1707.39 | MSA to MSA | Same county | Same county | No | Yes | 6 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 2 | Not in universe | 2 | 51 | 94 | 0 |
| 2 | 221821 | 35 | Self-employed-incorporated | 39 | 32 | High school graduate | 0 | Not in universe | Married-civilian spouse present | Personal services except private HH | Other service | White | All other | Male | Not in universe | Not in universe | Full-time schedules | 0 | 0 | 0 | Joint both under 65 | Not in universe | Not in universe | Householder | Householder | 2399.42 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 2 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 95 | 0 |
| 3 | 121138 | 63 | Not in universe | 0 | 0 | High school graduate | 0 | Not in universe | Widowed | Not in universe or children | Not in universe | White | All other | Female | Not in universe | Not in universe | Not in labor force | 0 | 0 | 0 | Single | Not in universe | Not in universe | Secondary individual | Nonrelative of householder | 100.34 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 0 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 0 | 95 | 0 |
| 4 | 257791 | 27 | Private | 45 | 3 | Masters degree(MA MS MEng MEd MSW MBA) | 0 | Not in universe | Never married | Other professional services | Executive admin and managerial | White | All other | Male | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Child 18+ never marr Not in a subfamily | Child 18 or older | 2147.89 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 0 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 0 | 94 | 0 |
| 5 | 58093 | 54 | Private | 12 | 35 | High school graduate | 0 | Not in universe | Married-civilian spouse present | Manufacturing-durable goods | Precision production craft & repair | White | All other | Female | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Joint both under 65 | Not in universe | Not in universe | Spouse of householder | Spouse of householder | 988.21 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 6 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 94 | 0 |
| 6 | 111883 | 51 | Self-employed-incorporated | 42 | 2 | Masters degree(MA MS MEng MEd MSW MBA) | 0 | Not in universe | Married-civilian spouse present | Medical except hospital | Executive admin and managerial | White | All other | Male | Not in universe | Not in universe | Children or Armed Forces | 15024 | 0 | 0 | Joint both under 65 | Not in universe | Not in universe | Householder | Householder | 2450.89 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 2 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 94 | 1 |
| 7 | 82451 | 63 | Private | 33 | 19 | High school graduate | 0 | Not in universe | Married-civilian spouse present | Retail trade | Sales | White | All other | Female | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 600 | Joint both under 65 | Not in universe | Not in universe | Spouse of householder | Spouse of householder | 1116.61 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 2 | Not in universe | Italy | Italy | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 94 | 0 |
| 8 | 20339 | 49 | Private | 32 | 3 | Bachelors degree(BA AB BS) | 0 | Not in universe | Married-civilian spouse present | Wholesale trade | Executive admin and managerial | White | All other | Female | Not in universe | Other job loser | Children or Armed Forces | 0 | 0 | 0 | Joint both under 65 | Not in universe | Not in universe | Spouse of householder | Spouse of householder | 273.31 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 1 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 2 | Not in universe | 2 | 52 | 94 | 0 |
| 9 | 28631 | 54 | Self-employed-not incorporated | 39 | 24 | High school graduate | 0 | Not in universe | Married-civilian spouse present | Personal services except private HH | Adm support including clerical | White | All other | Female | Not in universe | Not in universe | Full-time schedules | 0 | 0 | 0 | Joint both under 65 | Not in universe | Not in universe | Spouse of householder | Spouse of householder | 1545.30 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 1 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 95 | 0 |

Last rows

| | ID | Age | ClassOfWorker | IndustryCode | OccupationCode | Education | WagePerHour | EnrollInEdUInstlastWk | MaritalStatus | MajorIndustryCode | MajorOccupationCode | Race | HispanicOrigin | Sex | MemberOfALaborUnion | ReasonForUnemployment | FullOrPartTimeEmploymentStat | CapitalGains | CapitalLosses | DividendsFromStocks | TaxFilerStat | RegionOfPreviousResidence | StateOfPreviousResidence | DetailedHholdAndFamStat | DetailedHholdSumInHhold | InstanceWeight | MigCodeChangeInMsa | MigCodeChangeInReg | MigCodeMoveWithinReg | LiveInThisHouse1YearAgo | MigPrevResInSunbelt | NumOfPersonsWorkForEmployer | FamilyMembersUnder18 | CntryOfBirthFather | CntryOfBirthMother | CntryOfBirthSelf | Citizenship | OwnBusinessOrSelfEmployed | FillIncVeteransAdmin | VeteransBenefits | WeeksWorkedInYear | Year | Target |
| 224453 | 207001 | 47 | Private | 11 | 2 | High school graduate | 0 | Not in universe | Married-civilian spouse present | Manufacturing-durable goods | Executive admin and managerial | White | All other | Male | Not in universe | Not in universe | Full-time schedules | 0 | 0 | 10 | Joint both under 65 | Not in universe | Not in universe | Householder | Householder | 261.92 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 2 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 2 | Not in universe | 2 | 52 | 95 | 1 |
| 224454 | 40003 | 17 | Not in universe | 0 | 0 | 11th grade | 0 | High school | Never married | Not in universe or children | Not in universe | Black | All other | Male | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Child <18 never marr not in subfamily | Child under 18 never married | 1388.81 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 0 | Both parents present | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 0 | 94 | 0 |
| 224455 | 181789 | 49 | Private | 33 | 19 | High school graduate | 1000 | Not in universe | Divorced | Retail trade | Sales | White | Mexican-American | Female | Yes | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Single | Not in universe | Not in universe | RP of unrelated subfamily | Nonrelative of householder | 1026.93 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 6 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 94 | 0 |
| 224456 | 157935 | 0 | Not in universe | 0 | 0 | Children | 0 | Not in universe | Never married | Not in universe or children | Not in universe | White | All other | Male | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Child <18 never marr not in subfamily | Child under 18 never married | 3775.92 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 0 | Both parents present | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 0 | 0 | 95 | 0 |
| 224457 | 297238 | 48 | Private | 29 | 26 | Some college but no degree | 0 | Not in universe | Married-civilian spouse present | Transportation | Adm support including clerical | White | All other | Female | Not in universe | Not in universe | Full-time schedules | 0 | 0 | 50 | Joint both under 65 | Not in universe | Not in universe | Spouse of householder | Spouse of householder | 1916.28 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 4 | Not in universe | Canada | Canada | Canada | Foreign born- Not a citizen of U S | 0 | Not in universe | 2 | 52 | 95 | 0 |
| 224458 | 7818 | 6 | Not in universe | 0 | 0 | Children | 0 | Not in universe | Never married | Not in universe or children | Not in universe | White | All other | Male | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Child <18 never marr not in subfamily | Child under 18 never married | 231.38 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 0 | Mother only present | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 0 | 0 | 95 | 0 |
| 224459 | 7099 | 38 | Private | 29 | 38 | 11th grade | 0 | Not in universe | Never married | Transportation | Transportation and material moving | Amer Indian Aleut or Eskimo | Other Spanish | Male | Not in universe | Not in universe | Full-time schedules | 0 | 0 | 0 | Single | Not in universe | Not in universe | Nonfamily householder | Householder | 736.17 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 6 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 95 | 0 |
| 224460 | 210051 | 9 | Not in universe | 0 | 0 | Children | 0 | Not in universe | Never married | Not in universe or children | Not in universe | White | All other | Female | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Child <18 never marr not in subfamily | Child under 18 never married | 3111.26 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 0 | Both parents present | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 0 | 0 | 95 | 0 |
| 224461 | 283972 | 50 | Not in universe | 0 | 0 | 7th and 8th grade | 0 | Not in universe | Separated | Not in universe or children | Not in universe | White | Mexican-American | Female | Not in universe | Not in universe | Not in labor force | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Householder | Householder | 1368.82 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 0 | Not in universe | Mexico | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 0 | 95 | 0 |
| 224462 | 112533 | 54 | Private | 29 | 38 | 11th grade | 0 | Not in universe | Divorced | Transportation | Transportation and material moving | White | All other | Male | Not in universe | Not in universe | Full-time schedules | 0 | 0 | 0 | Single | Not in universe | Not in universe | Nonfamily householder | Householder | 1093.05 | NaN | NaN | NaN | Not in universe under 1 year old | NaN | 1 | Not in universe | Canada | Canada | Canada | Foreign born- Not a citizen of U S | 0 | Not in universe | 2 | 52 | 95 | 0 |